WO 00/53732    PCT/US00/06456






CLASS II DNA METHYLTRANSFERASES OF ZEA MAYS 



FIELD OF THE INVENTION 
The present invention relates to nucleic acid and amino acid sequences which
encode class II DNA methyltransferases. The present invention further relates to
methods of using the nucleic acid and amino acid sequences described herein to
stabilize transgene expression in transgenic plants, to alter the yield or biochemical
qualities of plants, and to silence targeted genes in plants in vivo.

BACKGROUND OF THE INVENTION 
The information content of a primary DNA sequence can be enhanced by the
addition of a methyl group to the ring structure of cytosine or adenine residues
(Finnegan, E.J., et al., Annu. Rev. Plant Physiol. Plant Mol. Biol. 49:223-47 (1998)).
The chemical modification of DNA is known to affect protein-DNA interactions.
Specifically, in prokaryotes, methylation of DNA prevents cleavage by the cognate
restriction endonucleases. Id. In higher eukaryotes, cytosine methylation can inhibit
binding of regulatory proteins, and methylation of promoter and coding sequences of
genes can repress transcription, both in vitro and in vivo. Id. Methylation of DNA
has been implicated in the timing of DNA replication, in determination of chromatin
structure, in increasing mutation frequency, as a causal agent for some human
diseases, and as a basis for epigenetic phenomena. Id.



Eukaryotic genomes are not methylated uniformly, but instead contain specific
methylated regions, with other domains remaining unmethylated (Martienssen, R.A.,
et al., Current Opinion in Genetics and Development, 5:234-242 (1995)). The
enzymes that transfer methyl groups to the cytosine ring are
cytosine-5-methyltransferases (hereinafter referred to as "DNA methyltransferases")
and have been characterized from a number of eukaryotes. All characterized
eukaryotic DNA methyltransferases exhibit little primary sequence specificity in vitro
other than the short canonical symmetrical sites methylated, which are CpG in animals,
and CpG and CpNpG in plants (where N stands for any nucleotide). Mammalian and plant
genomes contain methylation-free GC-rich zones, or CpG islands, which are
frequently associated with the 5' regions of housekeeping genes. Id.
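
As an illustration of the canonical sites named above, the following minimal
sketch locates candidate CpG and CpNpG positions in a DNA string. The function
name and the test sequence are hypothetical and are not part of the disclosure;
the sketch reports sequence context only and says nothing about whether a site
is actually methylated.

```python
import re

def find_methylation_sites(seq):
    """Return 0-based start positions of CpG and CpNpG sites in a DNA string.

    Lookahead patterns are used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    cpg = [m.start() for m in re.finditer(r"(?=CG)", seq)]
    # N stands for any nucleotide, so CpNpG is C, any base, then G.
    cpnpg = [m.start() for m in re.finditer(r"(?=C[ACGT]G)", seq)]
    return {"CpG": cpg, "CpNpG": cpnpg}

# Hypothetical example sequence, chosen only for illustration.
sites = find_methylation_sites("ACGTCAGCCGG")
print(sites)
```

Both motifs are symmetrical, so a site found on one strand implies the same
motif on the complementary strand.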

In plants, DNA methylation is necessary for normal development. For
example, Arabidopsis having reduced levels of DNA methylation demonstrate a range
of abnormalities, including loss of apical dominance, reduced stature, altered leaf size
and shape, reduced root length, homeotic transformation of floral organs and reduced
fertility (Finnegan, E.J., et al., Annu. Rev. Plant Physiol. Plant Mol. Biol. 49:223-47
(1998)). Moreover, Arabidopsis plants in which methylation had been reduced by at
least 70% became infertile after four to five generations of selfing. Id. A comparable
reduction in DNA methylation is embryo lethal in mammals. Id.

Two classes of DNA methyltransferase enzymes have been cloned in plants
(Finnegan, E.J., et al., Annu. Rev. Plant Physiol. Plant Mol. Biol. 49:223-47 (1998)):
class I and class II. Class I enzymes include MetI and MetII from Arabidopsis
(Finnegan et al., Nucleic Acids Res., 21(10):2383-2388 (1993); Nebendahl, et al., Gene
157(1-2):269-272 (1995)), Met1-5 and Met2-21 from carrot (Bernacchia, G., et al.,
Plant Physiol. 116:446-446 (1998)), C-5 MTase from tomato (Bernacchia, G., et al.,
Plant J., 13(3):317-330 (1998)), and C-5 MTase from pea (Pradhan et al., Nucleic
Acids Res., 26(5):1214-1222 (1998)). Class II sequences have been detected in many
species, with a defining characteristic being the presence of an embedded chromodomain
(Rose et al., Nucleic Acids Res., 26(7):1628-1635 (1998)). The only full-length class
II sequence is Cmt1 from Arabidopsis (GenBank AF039364).

Class I enzymes are homologous to dnmt1 from mice (Bestor, T., et al.,
EMBO J., 11(7):2611-2617 (1988)), the first cloned DNA methyltransferase. A
knockout of dnmt1 in mice resulted in lethality during embryogenesis (Li et al., Cell,
69(6):915-926 (1992)). Dnmt1 has been used as a model for all class I enzymes,
though it has not been proven whether this is appropriate in plant systems. Antisense
expression of MetI in Arabidopsis resulted in numerous developmental abnormalities
(Finnegan et al., Proc. Natl. Acad. Sci. U.S.A., 93(16):8449-8454 (1996)). Class I
enzymes are thought to function as maintenance enzymes, though proteolytic
cleavage could create de novo enzymes (Bestor, T.H., EMBO J., 11(7):2611-2617
(1992)). CpG activity has been shown for dnmt1 in mice and humans. In peas it was
found that pea C-5 MTase expressed in baculovirus displayed both CpG and CpNpG
activity (Pradhan et al., Nucleic Acids Res., 26(5):1214-1222 (1998)). In general,
class I enzymes have a high level of expression in tissues that are actively dividing
and are expressed at lower levels or silent in mature tissues.

There is little known regarding the function of class II enzymes. Cmt1 was
detected as an Arabidopsis genomic sequence based on sequence homology to other
methyltransferases. The C-terminal region contains the conserved methyltransferase
domains and a chromodomain. The N-terminal region is much shorter than the
N-terminal region of class I enzymes. Several commonly used ecotypes of Arabidopsis
contain an allele of Cmt1 which is interrupted by a transposon insertion. These Cmt1
knockouts do not have any detectable phenotype. No other research has been
published on the function of class II enzymes. Cmt1 is expressed only in floral tissues
at very low levels. Degenerate PCR has been used to show the presence of Cmt1
homologs in a number of other plant species (Rose et al., Nucleic Acids Res.,
26(7):1628-1635 (1998)). In addition to finding homologs in other species, two
sequences with similarity to Cmt1, Cmt2 and Cmt3, were identified in
Arabidopsis.

DNA methylation provides a mechanism for the mitotic propagation of
epigenetic states. Epigenetic lineage-dependent patterns of gene expression have
been studied the most in the germline and in somatic cell lineages in multicellular
eukaryotes (Martienssen, R.A., et al., Curr. Opin. Genet. and Develop., 5:234-242
(1995)). For example, in mice, the parentally imprinted genes H19 and Igf2r are
expressed in the embryo only when they are inherited via the female gamete. Id. In
contrast, the Igf2 gene is expressed only when inherited via the male gamete. Id. The
human homologs of the Igf2 and H19 genes are linked and parentally imprinted as in
the mouse. Id. Paternal uniparental disomy for this chromosomal region (11p15) is
associated with Beckwith-Wiedemann syndrome, which is believed to result from
overexpression of Igf2. Id. In addition to overgrowth of certain organs,
Beckwith-Wiedemann syndrome patients have a 700-fold predisposition to Wilms' tumor,
and loss of heterozygosity in this region is found in many other tumors as well. Id. It
has also been shown that 60-70% of Wilms' tumor patients have biallelic expression of
Igf2, H19, or both in tumor tissue, resulting from loss of imprinting rather than loss of
heterozygosity. Id.

In plants, epigenetic changes in gene expression are considered to be easier to
observe than in animals since there is little cell migration and clonal lineages stay
together. Id. Moreover, because in plants the germline arises relatively late in
development, many somatically variegated phenotypes can be followed into the next
generation and are heritable to greater or lesser extents. Id. Parental imprinting of
gene expression was first observed in plants at the R locus in maize. Id. Certain
alleles condition a mottled phenotype in the aleurone layer of the extra-embryonic
endosperm when inherited paternally, but cause a fully colored phenotype when
inherited maternally. Id. Genetic studies of modifier loci have revealed that it is the
maternally inherited R allele that is imprinted to a high level of expression. Id. High
levels of R expression correlate with demethylation of sites in the transcribed region
in the maternally inherited allele. Id.

Plants transformed with additional copies of endogenous genes or with 
multiple copies of a foreign or exogenous gene (these endogenous and exogenous 
genes are often referred to as "transgenes") frequently display epigenetic inactivation. 
This phenomenon is known as "gene silencing" or "co-suppression". There are two 
types of "gene silencing" or "co-suppression". The first is "transcriptional 
silencing". In "transcriptional silencing", RNA production from the introduced 
transgene is repressed. The second type of "gene silencing" is "posttranscriptional 
silencing". In "posttranscriptional silencing", transcripts do not accumulate in the 
cytoplasm even though transcription rates are comparable with or are higher than 
those in cells where transcripts do accumulate. 

Transcriptional silencing is associated with transgene methylation, particularly
in the promoter (Finnegan, E.J., et al., Annu. Rev. Plant Physiol. Plant Mol. Biol.
49:223-47 (1998)). Posttranscriptional silencing, which affects both transgenes and
homologous endogenous genes, is also associated with transgene methylation, but
within the coding sequence rather than the promoter. Id. It is believed that both
forms of gene silencing reflect normal, cellular defenses against invading or mobile
DNAs. Id.

Currently, two classes of methyltransferase genes have been cloned in maize.
The class I homolog is referred to as Zmet1 and the class II homolog as Zmet2.
Zmet1 is a class I enzyme that was cloned by Paula Olhoft and Ron Phillips at the
University of Minnesota. FIG. 4 is a summary of the major classes of 5-cytosine
methyltransferases from mammals, Arabidopsis and maize. The present invention
relates to zmet2a and zmet2b methyltransferases.

SUMMARY OF THE INVENTION 
In one embodiment, the present invention relates to an isolated and purified
Zea mays zmet2a methyltransferase nucleic acid sequence. Specifically, the isolated
and purified Zea mays zmet2a methyltransferase nucleic acid sequence of the present
invention hybridizes to the nucleic acid sequences shown in FIG. 1A and FIG. 1B under
stringent conditions. The zmet2a methyltransferase nucleic acid sequence encodes
the enzyme zmet2a methyltransferase. The amino acid sequences for zmet2a
methyltransferase are shown in FIG. 2A and FIG. 2B.

In another embodiment, the present invention further relates to recombinant
expression cassettes comprising the isolated and purified zmet2a nucleic acid
sequence described herein. Preferably, the recombinant expression cassettes further
contain a promoter sequence and a polyadenylation signal sequence. The promoter
sequence can be operably linked to the zmet2a nucleic acid sequence. The zmet2a
nucleic acid sequence is operably linked to the polyadenylation signal sequence. Any
promoter sequence can be used in the recombinant expression cassette, such as, but
not limited to, a constitutive or tissue-specific promoter.

In another embodiment, the present invention also relates to recombinant
expression cassettes comprising one or more heterologous nucleic acid sequences.
Such recombinant expression cassettes further contain a promoter sequence from the
zmet2a nucleic acid sequence and a polyadenylation signal sequence. The promoter
sequence is operably linked to the heterologous nucleic acid sequence. The
heterologous nucleic acid sequence is operably linked to the polyadenylation signal
sequence. Any heterologous promoter sequence can be used in this recombinant
expression cassette.

In a further embodiment, the present invention also relates to bacterial cells
comprising at least one of the recombinant expression cassettes described herein. The
bacterial cells can be Agrobacterium tumefaciens or Agrobacterium rhizogenes.



In a further embodiment, the present invention further relates to transgenic
plant cells and transgenic plants containing the recombinant expression cassettes
described herein. Monocotyledonous or dicotyledonous plant cells and plants can be
transformed with the hereinbefore described recombinant expression cassettes. Plants
which can be transformed with the recombinant expression cassettes of the present
invention include, but are not limited to, Zea mays, Oryza sativa, Secale cereale,
Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis
sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus
vulgaris, Brassica napus, etc. The present invention also relates to seed resulting
from the transgenic plants of the present invention.

In a further embodiment, the present invention further provides methods of
reducing or altering methyltransferase activity in a transgenic plant in order to
increase transgene expression stability and/or to improve the yield or biochemical
qualities of a plant, as well as a method of silencing targeted genes in a plant in vivo.
These methods comprise introducing into a plant a recombinant expression cassette
comprising an appropriate plant promoter operably linked to a zmet2a
methyltransferase nucleic acid sequence described herein in either the sense or
antisense direction.



In a further embodiment, the present invention relates to an isolated and
purified Zea mays zmet2b methyltransferase nucleic acid sequence. The zmet2b
methyltransferase nucleic acid sequence of the present invention can be isolated using
an isolated and purified partial Zea mays zmet2b methyltransferase nucleic acid
sequence. The isolated and purified partial Zea mays zmet2b methyltransferase
nucleic acid sequence can be used as a probe to isolate the zmet2b methyltransferase
nucleic acid encoding zmet2b methyltransferase. Preferably, the isolated and purified
partial Zea mays zmet2b methyltransferase nucleic acid described herein hybridizes to
the nucleic acid sequence shown in FIG. 23 under stringent conditions. The partial
zmet2b methyltransferase nucleic acid sequence described herein encodes a portion of
zmet2b methyltransferase. The partial amino acid sequence of zmet2b methyltransferase
is shown in FIG. 24. The zmet2b methyltransferase nucleic acid sequence can be used
in recombinant expression cassettes in the same manner as the isolated and purified
zmet2a nucleic acid sequence described herein. Such recombinant expression cassettes
can be used to create transgenic plants containing these recombinant expression
cassettes. Additionally, the zmet2b methyltransferase nucleic acid sequence can be
used to reduce or alter methyltransferase activity in transgenic plants in the same
manner as the zmet2a methyltransferase nucleic acid sequence.

Definitions

Units, prefixes, and symbols can be denoted in the SI accepted form. Numeric
ranges are inclusive of the numbers defining the range. Unless otherwise indicated,
nucleic acids are written left to right in 5' to 3' orientation. The
headings provided herein are not limitations of the various aspects or embodiments of
the invention, which can be had by reference to the specification as a whole.
Accordingly, the terms defined immediately below are more fully defined by
reference to the specification as a whole.

As used herein, the term "plant" includes reference to whole plants, plant
organs (e.g., leaves, stems, roots, etc.), seeds and plant cells and progeny thereof. The
class of plants which can be used in the methods of the present invention is generally
as broad as the class of higher plants amenable to transformation techniques,
including both monocotyledonous and dicotyledonous plants.

As used herein, "heterologous" when used to describe nucleic acids or
polypeptides refers to nucleic acids or polypeptides that originate from a foreign
species, or, if from the same species, are substantially modified from their original
form. For example, a promoter operably linked to a heterologous structural gene is
from a species different from that from which the structural gene was derived, or, if
from the same species, one or both are substantially modified from their original
form.

A nucleic acid or polypeptide that is "exogenous to" an individual plant is one
which is introduced into the plant by any means other than by a sexual cross.
Examples of means by which this can be accomplished are described below, and
include Agrobacterium-mediated transformation, biolistic methods, electroporation,
and the like. Such a plant containing the exogenous nucleic acid is referred to herein
as an R1 generation transgenic plant. Transgenic plants which arise from sexual cross
or by selfing are descendants of such a plant.

As used herein, "zmet2a methyltransferase gene" or "zmet2a
methyltransferase nucleic acid" refers to a nucleic acid encoding zmet2a
methyltransferase and which hybridizes under stringent conditions and/or has at least
60% sequence identity at the deduced amino acid level to the exemplified sequences
provided herein. The zmet2a polypeptide encoded by the zmet2a methyltransferase
gene has at least 55% or 60% sequence identity, typically at least 65% sequence
identity, preferably at least 70% sequence identity, often at least 75% sequence
identity, more preferably at least 80% sequence identity, and most preferably at least
90% sequence identity at the deduced amino acid level relative to the exemplary
zmet2a methyltransferase sequences provided herein.

As used herein, "zmet2a methyltransferase nucleic acid" includes reference to
a contiguous sequence from a zmet2a methyltransferase gene of at least 2454
nucleotides in length. In some embodiments the nucleic acid is preferably at least
2736 nucleotides in length (see FIG. 1A) and more preferably at least 2796
nucleotides in length (see FIG. 1B).

As used herein, "zmet2b methyltransferase gene" or "zmet2b
methyltransferase nucleic acid" refers to a nucleic acid encoding zmet2b
methyltransferase and which can be identified using the partial zmet2b
methyltransferase nucleic acid shown in FIG. 23. The zmet2b methyltransferase gene
hybridizes under stringent conditions to the partial zmet2b methyltransferase nucleic
acid shown in FIG. 23.

As used herein, "a partial zmet2b methyltransferase nucleic acid" includes
reference to a contiguous sequence of at least 1181 nucleotides in length and which is
from the zmet2b methyltransferase gene.

As used herein, "isolated" includes reference to material which is substantially
or essentially free from components which normally accompany or interact with it as
found in its naturally occurring environment. The isolated material optionally
comprises material not found with the material in its natural environment.

As used herein, "nucleic acid" includes reference to a deoxyribonucleotide or
ribonucleotide polymer in either single- or double-stranded form, and unless
otherwise limited, encompasses known analogues of natural nucleotides that hybridize
to nucleic acids in a manner similar to naturally occurring nucleotides. Unless
otherwise indicated, a particular nucleic acid sequence includes the complementary
sequence thereof.
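
Because a recited sequence is taken to include its complement, it can help to
see how the complementary strand is derived when both are written 5' to 3'. The
following is a minimal sketch for unambiguous DNA bases only; the function name
is hypothetical and ambiguity codes are not handled.

```python
# Watson-Crick pairing for the four DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the complementary strand of a DNA string, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

print(reverse_complement("ATGC"))
```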

As used herein, "operably linked" includes reference to a functional linkage
between a promoter and a second sequence, wherein the promoter sequence initiates
and mediates transcription of the DNA sequence corresponding to the second
sequence. Generally, operably linked means that the nucleic acid sequences being
linked are contiguous and, where necessary to join two protein coding regions,
contiguous and in the same reading frame.

In the expression of transgenes, one of ordinary skill in the art will recognize
that the inserted nucleic acid sequence need not be identical and may be "substantially
identical" to a sequence of the gene from which it was derived. As explained below,
these variants are specifically covered by this term.

In the case where the inserted nucleic acid sequence is transcribed and
translated to produce a functional zmet2a and/or zmet2b methyltransferase
polypeptide, one of ordinary skill in the art will recognize that because of codon
degeneracy, a number of nucleic acid sequences will encode the same polypeptide.
These variants are specifically covered by the term "zmet2a methyltransferase nucleic
acid sequence" or "zmet2b methyltransferase nucleic acid sequence". In addition, the
term specifically includes those full length sequences substantially identical
(determined as described below) with a zmet2a and/or zmet2b methyltransferase
gene sequence which encode proteins that retain the function of the zmet2a and/or
zmet2b methyltransferase. Thus, in the case of the zmet2a and/or zmet2b
methyltransferase genes described herein, the term includes variant nucleic acid
sequences which have substantial identity with the sequences disclosed herein and
which encode proteins capable of reducing or regulating DNA methylation in a
transgenic plant for various purposes as well as silencing target genes in a plant using
the nucleic acid sequences described herein.
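
To give a sense of how many nucleic acid sequences codon degeneracy allows for
one polypeptide, the following minimal sketch counts the distinct coding
sequences for a short peptide. It assumes the standard genetic code and
one-letter amino acid symbols, neither of which is specified in the text; the
function name and the example peptides are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that encode each amino acid ('*' marks stop codons).
DEGENERACY = Counter(CODON_TABLE.values())

def count_coding_sequences(peptide):
    """Count the distinct codon strings that translate to the given peptide."""
    return math.prod(DEGENERACY[aa] for aa in peptide.upper())

print(count_coding_sequences("MW"))   # Met and Trp each have a single codon
print(count_coding_sequences("MLS"))  # Leu and Ser each have six codons
```

The count grows multiplicatively with peptide length, which is why two nucleic
acids can encode the same protein and still differ at many positions.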

Two nucleic acids or polypeptides are said to be "identical" if the sequence of
nucleotides or amino acid residues, respectively, in the two sequences is the same
when aligned for maximum correspondence as described below. The term
"complementary to" is used herein to mean that the complementary sequence is
identical to all or a specified contiguous portion of a reference nucleic acid sequence.

Sequence comparisons between two (or more) nucleic acids or polypeptides are
typically performed by comparing sequences of two optimally aligned sequences over
a segment or "comparison window" to identify and compare local regions of sequence
similarity. Optimal alignment of sequences for comparison may be conducted by the
local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by
the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443
(1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad.
Sci. (U.S.A.) 85:2444 (1988), by computerized implementation of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group (hereinafter "GCG"), 575 Science Dr., Madison,
WI), or by inspection.
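
The global alignment approach of Needleman and Wunsch cited above can be
illustrated with a short dynamic-programming sketch that returns only the
optimal alignment score. The scoring values (match, mismatch, linear gap
penalty) are illustrative assumptions; they are not the GCG program defaults,
and this is not the GAP or BESTFIT implementation.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b.

    Uses a linear gap penalty and keeps only one row of the score matrix.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]

print(needleman_wunsch_score("GATTACA", "GATACA"))
```

A full implementation would also trace back through the matrix to recover the
aligned sequences, which are what the percentage identity calculation uses.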


"Percentage of sequence identity" is determined by comparing two optimally
aligned sequences over a comparison window, where the portion of the nucleic acid
sequence in the comparison window may comprise additions or deletions (i.e., gaps)
as compared to the reference sequence (which does not comprise additions or
deletions) for optimal alignment of the two sequences. The percentage is calculated
by determining the number of positions at which the identical nucleic acid base or
amino acid residue occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions
in the window of comparison and multiplying the result by 100 to yield the percentage
of sequence identity.
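
The calculation just described can be sketched as follows, assuming the two
sequences have already been aligned and that gaps are written as "-" (a
notational assumption; the function name is hypothetical). The whole aligned
length is treated as the comparison window.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where both sequences carry the same residue;
    a gap character never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matched / len(aligned_a)

# Six matched positions in a window of seven positions.
print(percent_identity("GATTACA", "GAT-ACA"))
```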

The term "substantial identity" of nucleic acid sequences means that a nucleic
acid comprises a sequence that has at least 55% or 60% sequence identity, generally
at least 65%, preferably at least 70%, often at least 75%, more preferably at least 80%,
and most preferably at least 90%, compared to a reference sequence using the
programs described above (preferably BESTFIT) using standard parameters. One of
ordinary skill in the art will recognize that these values can be appropriately adjusted
to determine corresponding identity of proteins encoded by two nucleotide sequences
by taking into account codon degeneracy. Substantial identity of amino acid sequences
for those purposes normally means sequence identity of at least 55% or 60%, preferably
at least 70%, more preferably at least 80%, and most preferably at least 95%.
Polypeptides having "sequence similarity" share sequences as noted above except that
residue positions which are not identical may differ by conservative amino acid changes.
Conservative amino acid substitutions refer to the interchangeability of residues having
similar side chains. For example, a group of amino acids having aliphatic side chains is
glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having
aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having
aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids
having basic side chains is lysine, arginine, and histidine; and a group of amino acids
having sulfur-containing side chains is cysteine and methionine. Preferred conservative
amino acid substitution groups are: valine-leucine-isoleucine,
phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
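
The side-chain groups listed above lend themselves to a simple lookup. The
following minimal sketch encodes only the five named side-chain groups, using
one-letter amino acid symbols as an assumption (the text uses full names); the
function name is hypothetical, and the preferred pair asparagine-glutamine is
not covered because it falls outside those five groups.

```python
# Side-chain groups as listed in the text, written with one-letter symbols.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}

def is_conservative(a, b):
    """True if a and b are different residues sharing one of the listed groups."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in members and b in members
               for members in SIDE_CHAIN_GROUPS.values())

print(is_conservative("L", "I"))  # both aliphatic
print(is_conservative("K", "D"))  # basic versus a residue in no listed group
```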

Another indication that nucleic acid sequences are substantially identical is if
two molecules hybridize to each other under appropriate conditions. Appropriate
conditions can be high or low stringency and will be different in different
circumstances. Generally, stringent conditions are selected to be about 5°C to about
20°C lower than the thermal melting point (Tm) for the specific sequence at a defined
ionic strength and pH. The Tm is the temperature (under defined ionic strength and
pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
Typically, stringent wash conditions are those in which the salt concentration is about
0.22 molar at pH 7 and the temperature is at least about 50°C. However, nucleic acids
which do not hybridize to each other under stringent conditions are still substantially
identical if the polypeptides which they encode are substantially identical. This may
occur, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code.
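
The text does not state how Tm is estimated. As a rough illustration only, the
sketch below uses the widely cited empirical approximation of Meinkoth and Wahl
for DNA-DNA hybrids, Tm = 81.5 + 16.6 log10(M) + 0.41 (%GC) - 0.61 (% formamide)
- 500/L, which is an assumption brought in here and not part of the disclosure.
It then derives the 5°C to 20°C window below Tm described above. Function names
and the example probe are hypothetical.

```python
import math

def estimate_tm(seq, monovalent_molar, formamide_percent=0.0):
    """Approximate Tm in degrees C for a DNA-DNA hybrid (Meinkoth-Wahl form)."""
    seq = seq.upper()
    length = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)

def stringent_window(tm):
    """Temperature range lying about 20 to about 5 degrees C below Tm."""
    return (tm - 20.0, tm - 5.0)

# Hypothetical 100-nucleotide probe with 50% GC content at 1.0 molar salt.
tm = estimate_tm("AC" * 50, monovalent_molar=1.0)
print(tm, stringent_window(tm))
```

Empirical formulas of this kind are approximations for long probes; actual
stringency is normally confirmed experimentally.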

Nucleic acids of the present invention can be identified from a cDNA or
genomic library prepared according to standard procedures, using the nucleic acids
disclosed here as a probe. For example, stringent hybridization conditions will
typically include at least one low stringency wash using 0.3 molar salt (e.g., 2X SSC)
at 65°C. The washes are preferably followed by one or more subsequent washes
using 0.03 molar salt (e.g., 0.2X SSC) at 50°C, usually 60°C, or more usually 65°C.
Nucleic acid probes used to isolate the nucleic acids are preferably at least 100
nucleotides in length.
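
The salt figures quoted above follow from the strength of the SSC buffer. The
following minimal sketch converts an SSC strength to sodium chloride molarity,
assuming the conventional definition of 1X SSC as 0.15 molar sodium chloride
(with 0.015 molar sodium citrate, which is ignored here). That definition is
consistent with the 2X and 0.2X values in the text but is not itself stated
there; the names are hypothetical.

```python
# Assumed convention: 1X SSC contains 0.15 M NaCl.
NACL_MOLAR_PER_1X_SSC = 0.15

def ssc_to_nacl_molar(ssc_strength):
    """Return the NaCl molarity of an SSC buffer of the given strength."""
    return ssc_strength * NACL_MOLAR_PER_1X_SSC

for strength in (2.0, 0.2):
    print(f"{strength}X SSC is about {ssc_to_nacl_molar(strength):.2f} M NaCl")
```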

As used herein, a homologue of a particular zmet2a and/or zmet2b
methyltransferase gene is a second gene (either in the same species or in a different
species) which encodes a protein having an amino acid sequence having at least 50%
identity or 75% similarity (determined as described above) to a polypeptide
sequence in the first gene product.

As used herein, "nucleotide binding site" or "nucleotide binding domain"
includes reference to a region consisting of kinase-1a, kinase 2, and kinase 3a motifs,
which participates in ATP/GTP-binding. Such motifs are described for instance in Yu
et al., Proc. Natl. Acad. Sci. USA 93:11751-11756 (1996); Mindrinos, et al., Cell
78:1089-1099 and Shen et al., FEBS, 335:380-385 (1993).

As used herein, "tissue-specific promoter" includes reference to a promoter in
which expression of an operably linked gene is limited to a particular tissue or tissues.

As used herein, "recombinant" includes reference to a cell, or nucleic acid, or
vector, that has been modified by the introduction of a heterologous nucleic acid or
the alteration of a native nucleic acid to a form not native to that cell, or that the cell is
derived from a cell so modified. For example, recombinant cells express genes that
are not found within the native (non-recombinant) form of the cell or express native
genes that are otherwise abnormally expressed, under expressed or not expressed at
all.



As used herein, a "recombinant expression cassette" is a nucleic acid
construct, generated recombinantly or synthetically, with a series of specified nucleic
acid elements which permit transcription of a particular nucleic acid in a target cell.
The expression vector can be part of a plasmid, virus, or nucleic acid fragment.
Typically, the recombinant expression cassette portion of the expression vector
includes a nucleic acid to be transcribed, and a promoter.

As used herein, "transgenic plant" includes reference to a plant modified by
introduction of a heterologous nucleic acid. Generally, the heterologous nucleic acid
is a zmet2a and/or zmet2b methyltransferase structural or regulatory gene or
subsequences or combinations thereof.



As used herein, "hybridization complex" includes reference to a duplex
nucleic acid sequence formed by selective hybridization of two single-stranded
nucleic acids with each other.



As used herein, "amplified" includes reference to an increase in the molarity
of a specified sequence. Amplification methods include the polymerase chain
reaction (hereinafter "PCR"), the ligase chain reaction (hereinafter "LCR"), the
transcription-based amplification system (hereinafter "TAS"), and the self-sustained
sequence replication system (hereinafter "SSR"). A wide variety of cloning methods,
host cells, and in vitro amplification methodologies are well-known to persons of
ordinary skill in the art.

As used herein, "nucleic acid sample" includes reference to a specimen
suspected of comprising a zmet2a and/or zmet2b methyltransferase gene.

SEQUENCE LISTINGS 
The present application contains a number of nucleotide sequences and amino
acid sequences. For the nucleotide sequences, the base pairs are represented by the
following base codes:

Symbol    Meaning
A         adenine
C         cytosine
G         guanine
T         thymine
U         uracil
M         A or C
R         A or G
W         A or T/U
S         C or G
Y         C or T/U
K         G or T/U
V         A or C or G; not T/U
H         A or C or T/U; not G
D         A or G or T/U; not C
B         C or G or T/U; not A
N         (A or C or G or T/U)
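
The base codes above can be applied mechanically to enumerate every concrete
sequence that an ambiguous sequence stands for. The following is a minimal
sketch; the function name and example are hypothetical, and for simplicity the
sketch writes T for the T/U position in its output.

```python
from itertools import product

# Each symbol maps to the bases it can represent (T stands for T/U here).
BASE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

def expand_ambiguous(seq):
    """List every unambiguous sequence represented by a base-code string."""
    choices = (BASE_CODES[symbol] for symbol in seq.upper())
    return ["".join(bases) for bases in product(*choices)]

# Y is C or T/U, so this three-symbol sequence stands for two sequences.
print(expand_ambiguous("AYG"))
```

The number of expansions multiplies with each ambiguous position, so this is
practical only for short stretches such as degenerate primer sites.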



The amino acids shown in the application are in the L-form and are
represented by the following amino acid three-letter abbreviations:

Abbreviation    Amino acid name
Ala             L-Alanine
Arg             L-Arginine
Asn             L-Asparagine
Asp             L-Aspartic Acid
Asx             L-Aspartic Acid or Asparagine
Cys             L-Cysteine
Glu             L-Glutamic Acid
Gln             L-Glutamine
Glx             L-Glutamine or Glutamic Acid
Gly             L-Glycine
His             L-Histidine
Ile             L-Isoleucine
Leu             L-Leucine
Lys             L-Lysine
Met             L-Methionine
Phe             L-Phenylalanine
Pro             L-Proline
Ser             L-Serine
Thr             L-Threonine
Trp             L-Tryptophan
Tyr             L-Tyrosine
Val             L-Valine
Xaa             L-Unknown or other
BRIEF DESCRIPTION OF THE DRAWINGS 



FIG. 1A shows the nucleic acid sequence of the zmet2a methyltransferase
gene containing 2736 basepairs. FIG. 1B shows the nucleic acid sequence of the
zmet2a methyltransferase gene containing 2796 basepairs.

FIG. 2A shows the amino acid sequence of the zmet2a methyltransferase
containing 912 amino acids and which is encoded by the nucleic acid sequence shown
in FIG. 1A. FIG. 2B shows the amino acid sequence of the zmet2a methyltransferase
containing 932 amino acids and which is encoded by the nucleic acid sequence shown
in FIG. 1B.

FIG. 3 shows the PCR primers used to sequence the zmet2a methyltransferase gene.



FIG. 4 is a summary of the major classes of 5-cytosine methyltransferases 
from mammals, Arabidopsis and maize. 



FIG. 5 shows the genomic sequence of the zmet2a methyltransferase gene and the
retrotransposon SPRITE-1, along with intron-exon divisions, a restriction site map

FIG. 6 lists the World Wide Web sites used to process the sequence data for
the zmet2a methyltransferase gene.



FIG. 7 shows a Southern blot of B73 DNA digested with HindIII and probed
with clone CGET064. The Southern blot shows the presence of multiple copies of
zmet2a or zmet2a-like genes in the B73 genome. DNA from B73 was digested with
HindIII and probed with clone CGET064, which does not contain a HindIII site. The
gene cloned and sequenced is represented by the upper band.

FIG. 8 shows the alignment of the amino acid sequence from zmet2a with the
amino acid sequence of Arabidopsis chromomethylase CMT1 (AF039367) and the
C-terminal methylase domains from the DNA methyltransferases of maize zmet1
(AF063403) and Arabidopsis MET1 (P34881). Zmet2a shows similarity along the
entire length of CMT1, but significant similarity with zmet1 and MET1 exists only in
the conserved motifs. Bold uppercase letters, normal uppercase letters, and lower case
letters indicate identity, conservation, and differences in amino acid sequences
relative to zmet2a, respectively. Dashes in the sequences are gaps introduced by
CLUSTAL W to optimize the alignments. The locations of the six conserved
methylase motifs are indicated in the sequence. The chromodomain is located
upstream and adjacent to motif IV. The Mu insertion into the coding region of motif
IX alters zmet2a function, resulting in decreased methylation at CpNpG sites. Putative
nuclear localization signal peptides, NLS (N. Raikhel, Plant Physiol. 100, 1627 (1992)),
are positioned in the N-terminal portion of the protein.

FIG. 9 lists the putative identification of zmet2a amino acids involved in
catalysis by comparison with amino acids of M.HhaI with known catalytic functions.
The amino acids of M.HhaI with catalytic functions were determined by
crystallography by Cheng et al., Cell, 74:299-307 (1993). Amino acids of zmet2a are
numbered as in Figure 7.



FIG. 10 shows Southern analysis of repetitive DNA methylation patterns.
Total genomic DNA (5 µg per lane) from an F4 derived F5 family segregating for
zmet2a::Mu1 was digested with the isoschizomers HpaII and MspI, which recognize the
sequence CCGG. Digested DNA was electrophoresed through 0.8% agarose,
transferred to nylon membrane, and hybridized with probes for repetitive DNA: the
9 kb 26s-5.8s-17s ribosomal repeat (FIG. 10A), the 5s ribosomal repeat (FIG. 10B), and a
centromeric repeat pSau3a9 (FIG. 10C). Decreased methylation is observed in
mutant plants (-/-) relative to nonmutant plants (+/+) digested with MspI, which is
sensitive to methylation at mCpCpG sequences. No changes in methylation patterns at
mCpG sites are observed in mutant plants, as indicated by the lack of digestion with
HpaII. Plants heterozygous for zmet2a::Mu1 (+/-) also show decreases at mCpCpG
sites.

FIG. 11 shows gels from a Southern analysis which demonstrate that plants
homozygous for zmet2a::Mu1 have decreased methylation at CpNpG sites. More
sites are cut by restriction enzymes that are sensitive to methylation at CpNpG sites in
zmet2a::Mu1 plants. EcoRII is sensitive to methylation at CC*A/TGG sites, where *
indicates the sensitive cytosine (FIG. 11A). BglII is sensitive to methylation at
AGATC*T sites (FIG. 11B). PstI is sensitive to methylation at C*TGCAG sites
(FIG. 11C). BamHI is sensitive to methylation at GGATC*C sites (FIG. 11D). AvaII
is sensitive to methylation at GGA/TC*C sites (FIG. 11E). Changes at CpG sites
cannot be separated from CpCpG in the AvaII digests. DNA from the same plants as
those in Figure 10 was digested and hybridized with the repetitive probes as
described herein.
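The recognition sites named for FIG. 11 can be located in a sequence of interest to see which cytosines a given digest would report on. The following Python sketch is illustrative only: the site strings and the position of the sensitive cytosine are transcribed from the description above, while the names SITES, sensitive_cytosines and context, and the example sequence in the usage note, are hypothetical.

```python
import re

# Recognition site (as a regular expression, A/T written as [AT]) and the 0-based
# offset of the methylation-sensitive cytosine marked with * in the description.
SITES = {
    "EcoRII": ("CC[AT]GG", 1),
    "BglII": ("AGATCT", 4),
    "PstI": ("CTGCAG", 0),
    "BamHI": ("GGATCC", 4),
    "AvaII": ("GG[AT]CC", 3),
}


def sensitive_cytosines(sequence):
    """Map each enzyme to the positions of the sensitive cytosine in its sites."""
    seq = sequence.upper()
    hits = {}
    for enzyme, (pattern, offset) in SITES.items():
        # A lookahead is used so that overlapping sites are all reported.
        hits[enzyme] = [m.start() + offset
                        for m in re.finditer("(?=" + pattern + ")", seq)]
    return hits


def context(sequence, position):
    """Classify the cytosine at `position` as CpG, CpNpG or other (top strand only)."""
    seq = sequence.upper()
    if seq[position + 1:position + 2] == "G":
        return "CpG"
    if seq[position + 2:position + 3] == "G":
        return "CpNpG"
    return "other"
```

Applied to a made-up sequence such as AACCAGGTTCTGCAGAA, the sketch reports one EcoRII site and one PstI site, and both sensitive cytosines fall in a CpNpG context.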

FIG. 12 shows the cytosine methylation levels in an F4 derived F5 segregating
line for zmet2a::Mu1. The 5-methylcytosine content of DNA extracted from tissue of
immature 5th-7th leaves was determined by reverse phase HPLC using the method of
Gehrke et al. Values were obtained from three wildtype plants, seven heterozygous
plants and five homozygous plants. Two samples were run for each plant.
Percentages of 5mC content [5mC/(5mC + C)] were calculated from concentrations
determined from integration of peaks and comparison to known standards.
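The percentage calculation stated for FIG. 12, 5mC/(5mC + C), can be written out as follows. This is a minimal Python sketch with hypothetical function names; averaging the two samples run for each plant is an assumption made here for illustration, and any concentrations passed to it in an example would be placeholders, not measured data.

```python
def percent_5mc(conc_5mc, conc_c):
    """Return 100 * 5mC / (5mC + C) for one HPLC sample."""
    total = conc_5mc + conc_c
    if total <= 0:
        raise ValueError("total cytosine concentration must be positive")
    return 100.0 * conc_5mc / total


def plant_mean(samples):
    """Average the percentage over the samples run for one plant.

    `samples` is a list of (5mC concentration, C concentration) pairs.
    """
    values = [percent_5mc(m, c) for m, c in samples]
    return sum(values) / len(values)
```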



FIG. 13 shows gels from a Southern analysis which demonstrate that plants
homozygous for zmet2a::Mu1 have a reduced level of methylation that is stable
over generations. Two F2 derived F3 families homozygous for zmet2a::Mu1, B5 and
B6, were self pollinated to the F6 generation. Two lineages from B5 and three
lineages from B6 were grown at the University of Wisconsin West Madison
Agronomy Farm in 1999. Methylation levels are consistent across generations. Once
zmet2a::Mu1 is in a homozygous state, methylation is reduced to a specific level and
no further reductions occur. Dilution of methylation is not observed in each
successive generation. DNA from leaf tissue was digested with MspI and the
Southern blot was hybridized with the 9 kb ribosomal repetitive probe.

FIG. 14 shows gels from a Southern analysis which demonstrate that
methylation levels are restored to nonmutant parental levels in backcross progeny
homozygous for wildtype zmet2a. An F1 hybrid of an F4 line homozygous for
zmet2a::Mu1 (lanes 1-3) and the inbred line Mo17 (lanes 4-6) was backcrossed to the
nonmutant Mo17 parent to generate plants homozygous wildtype and plants
heterozygous for zmet2a::Mu1. F1 plants (lanes 7-11) have methylation levels
intermediate between those of the parents. BC1 progeny heterozygous for zmet2a::Mu1
(lanes 12-17) have methylation levels similar to the F1. BC1 plants restored to wild-type
zmet2a (lanes 18-21) are remethylated to levels comparable to the nonmutant
parent line. Complete or near complete remethylation has occurred within one sexual
generation. DNA was extracted from the 4th-6th immature leaves of greenhouse
grown seedlings, digested with PstI, which is sensitive to methylation at mCTGCAG
sequences, and hybridized to the pSau3a9 centromeric repeat.

FIG. 15 shows gels from a Southern analysis which demonstrate the
expression of zmet2a in different tissues during development. Southern blots were
produced with cDNA's synthesized from mRNA extracted from embryos 24 days
after pollination (hereinafter "DAP"), young leaves, immature ear, immature tassel,
BMS callus, and 10 day old seedlings. Figure 15A shows the ethidium bromide
stained gel. All lanes were loaded with 750 ng of cDNA except for the 10 day
seedlings, of which 280 ng was loaded due to the limited amount available. The
cDNA's were quantified by spectrophotometry. The marker lane contains 800 ng of
lambda DNA digested with HindIII. Figure 15B shows the Southern blot hybridized
with the zmet2a cDNA probe. Hybridization is observed in tissues that are actively
undergoing cell division. Figure 15C shows the same blot hybridized to a ubiquitin
probe to show cDNA loading variation.

FIG. 16 shows the structure of maize retrotransposon SPRITE-1 and sequence
of Long Terminal Repeat (hereinafter "LTR") components. FIG. 16A shows that
SPRITE-1 consists of long terminal direct repeats, a tRNA primer binding site
(hereinafter "PBS"), coding sequence for proteins necessary for replication and
transposition, and a polypurine tract (hereinafter "PPT"). FIG. 16B identifies the
sequences for the 5' and 3' LTR, PBS and PPT. Each LTR has a 3 base pair inverted
repeat which is also shown in the drawing. A putative TATA box is underlined and
the putative transcription start site is italicized. The 5 base pair host insertion site
duplications are also identified.

FIG. 17 shows the alignments of the conserved protein motifs of the Ty1/copia
elements with SPRITE-1. The maize retrotransposon SPRITE-1 is aligned with the
retrotransposon hopscotch (U2626) from maize, retrofit (U72725) from rice, an
unpublished Arabidopsis retrotransposon (AC006528) and the copia element from
Drosophila (M11240).



FIG. 18 shows that the SPRITE-1 copy number and insertion sites differ
among maize inbred lines. DNA (7 µg) from inbred maize lines, barley, rice, rye,
wheat, and potato was digested with EcoRI, which does not cut within the
retroelement sequence. The Southern blot was hybridized with a 950 bp SPRITE-1
fragment which includes the 5' untranslated sequence and 5' sequence putatively
coding for the gag protein but does not include the conserved gag motif or the 5'
terminal repeat.

FIG. 19 shows the identification of inbred lines containing a SPRITE-1
insertion in zmet2a. PCR was conducted on maize inbred lines from various origins
using a primer upstream of the SPRITE-1 insertion site, 15F, in conjunction with a
SPRITE-1 specific primer, 18R, or a zmet2a primer downstream of the element, 8R.
The upper panel (15F/8R) shows the inbreds that do not have a SPRITE-1 insertion.
The lower panel (15F/18R) shows that Mo17 and A682 have a SPRITE-1 insertion
into zmet2a. A682 has an amplification product from both primer sets, indicating that
it may be hemizygous for SPRITE-1.

FIG. 20 shows expression of retroelement SPRITE-1. Figure 20A shows a
Southern blot of cDNAs from roots, immature embryo 24 days after pollination
(hereinafter, "DAP"), young leaf, young leaf with inactive zmet2a, immature ear,
immature tassel, mature pollen, Black Mexican Sweet (hereinafter, "BMS") callus,
and 10 day seedling, hybridized with a SPRITE-1 probe. Transcription of SPRITE-1
is evident as indicated by the hybridization to cDNA from embryo and leaf tissue.
Expression is highest in leaf tissue, with significantly more expression being observed
in leaf tissue from zmet2a::Mu1 plants that have decreased CpNpG methylation. FIG.
20B shows the same Southern blot hybridized to a ubiquitin probe as a loading
control.

FIG. 21 shows that the presence of a SPRITE-1 insertion into a zmet2a intron
does not alter transcript splicing. Fragments spanning the SPRITE-1 insertion and
downstream from the insertion site were amplified by PCR from cDNA's. FIG. 21A
shows a scaled representation of zmet2a. Exons are represented by large blocks while
the intervening introns are depicted by lines. The insertion of the retroelement is
indicated above the zmet2a diagram. The element is inserted in the opposite
orientation relative to zmet2a as indicated by the boxed arrows which represent the
direct repeats. Positions of the primers used to generate fragments are indicated
below the zmet2a diagram. Fragments were amplified from B73 (FIG. 21B)
immature ear cDNA which does not contain the retroelement insertion and Mo17 (M)
embryo 24 days after pollination cDNA (FIG. 21B) and Mo17 (M) 10 day seedling
cDNA (FIG. 21C). No differences were observed on the ethidium bromide stained
gel of the PCR products. FIGS. 21B and 21C show hybridization of a near full length
B73 cDNA probe to a Southern blot of the PCR fragments.

FIG. 22 shows the methylation status of SPRITE-1. DNA from immature
leaves was digested with methylation sensitive restriction enzymes. Southern blots
were hybridized with a 970 base pair fragment from the 5' end of the untranslated
region of SPRITE-1. There are 5 BstNI/EcoRII sites, 1 MspI/HpaII site and 1 PstI
site within the sequence context of this probe. Nearly all sites are methylated in this
region.

FIG. 23 shows a partial nucleic acid sequence of the zmet2b methyltransferase
gene.

FIG. 24 shows a partial amino acid sequence of the zmet2b methyltransferase
encoded by the partial nucleic acid sequence shown in FIG. 23.

FIG. 25 shows a comparison of a portion of the amino acid sequence for
zmet2a methyltransferase with a portion of the amino acid sequence for zmet2b
methyltransferase.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
In one embodiment, the present invention relates to a zmet2a
methyltransferase gene. The zmet2a methyltransferase gene of the present invention
encodes a class II methyltransferase which controls CpNpG methylation.
Nucleic acid sequences from the zmet2a methyltransferase gene described herein can
be used to reduce or to alter the level of DNA methylation in a plant. In addition, the
zmet2a nucleic acid sequence described herein can be used to methylate a targeted
gene in a plant in vivo to "silence" or "knock-out" said gene.

In another embodiment, the present invention relates to a zmet2b
methyltransferase gene. The zmet2b methyltransferase gene can be isolated using a
partial zmet2b methyltransferase gene described herein. Like the zmet2a
methyltransferase gene, the zmet2b methyltransferase gene encodes a class II
methyltransferase which controls CpNpG methylation. Nucleic acid sequences
encoding the zmet2b methyltransferase gene can be used in the same manner as the
nucleic acid sequence encoding the zmet2a methyltransferase gene to reduce or to
alter the level of DNA methylation in a plant. In addition, the zmet2b nucleic acid
sequence can be used to methylate a targeted gene in a plant in vivo to "silence" or
"knock-out" said gene.

The present invention is applicable to a broad range of types of
monocotyledonous and dicotyledonous plants, including, but not limited to, Zea mays,
Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea,
Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon
esculentum, Phaseolus vulgaris, and Brassica napus.

The nucleic acids of the present invention can be used in marker-aided
selection. Marker-aided selection does not require the complete sequence of the gene
or precise knowledge of which sequence confers which specificity. Instead, partial
sequences can be used as hybridization probes or as the basis for oligonucleotide
primers to amplify by PCR or other methods to follow the segregation of chromosome
segments containing the zmet2a and/or zmet2b methyltransferase gene(s) in plants.
Because the zmet2a or zmet2b methyltransferase marker is the gene itself, there can
be negligible recombination between the marker and the methylated phenotype.
Thus, the nucleic acids of the present invention can be used to provide an optimal
means to DNA fingerprint class II DNA methyltransferases in other cultivars and wild
germplasm. This can be used to indicate if other germplasm accessions and cultivars
carry the same zmet2a and/or zmet2b methyltransferase genes.

Preparation of the Nucleic Acids of the Present Invention

Generally, the nomenclature and the laboratory procedures involved with
recombinant DNA technology described below are those well known and commonly
employed by those of ordinary skill in the art. Standard techniques are used for
cloning, DNA and RNA isolation, amplification and purification. Generally,
enzymatic reactions involving DNA ligase, DNA polymerase, restriction
endonucleases and the like are performed according to the manufacturer's
specifications. These techniques and various other techniques are generally
performed according to Sambrook et al., Molecular Cloning - A Laboratory Manual,
Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1989).



The isolation of zmet2a and/or zmet2b methyltransferase gene(s) can be
accomplished via a number of techniques. For instance, oligonucleotide probes based
on the sequences disclosed herein can be used to identify the desired gene in a cDNA
or genomic DNA library. To construct genomic libraries, large segments of genomic
DNA are generated by random fragmentation, e.g. using restriction endonucleases,
and are ligated with vector DNA to form concatemers that can be packaged into the
appropriate vector. To prepare a cDNA library, mRNA is isolated from the desired
organ of a particular plant, such as shoots from Zea mays, and a cDNA library which
contains the zmet2a or zmet2b methyltransferase gene transcript is prepared from the
mRNA. Alternatively, cDNA may be prepared from mRNA extracted from other
tissues in which the zmet2a or zmet2b methyltransferase gene or homologs are
expressed.

The cDNA or genomic library can then be screened using a probe based upon
the sequence of a cloned zmet2a and/or zmet2b methyltransferase gene or partial
sequence from either thereof (such as the partial zmet2b methyltransferase nucleic
acid sequence shown in FIG. 23). Probes may be used to hybridize with genomic
DNA or cDNA sequences to isolate homologous genes in the same or different plant
species.

Those of ordinary skill in the art will appreciate that various degrees of
stringency of hybridization can be employed in the assay; and either the hybridization
or the wash medium can be stringent. As the conditions for hybridization become
more stringent, there is a greater degree of complementarity required between the
probe and the target for duplex formation to occur. The degree of stringency can be
controlled by temperature, ionic strength, pH and the presence of a partially
denaturing solvent such as formamide. For example, the stringency of hybridization
is conveniently varied by changing the polarity of the reactant solution through
manipulation of the concentration of formamide within the range of 0% to 50%.
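The effect of formamide and ionic strength on stringency can be made concrete with a melting-temperature estimate. The Python sketch below uses a widely quoted empirical approximation for long DNA-DNA hybrids; this formula is an assumption introduced here for illustration and is not taken from the present disclosure, and the function name and parameter names are hypothetical.

```python
import math


def estimated_tm(gc_percent, length_bp, na_molar, formamide_percent):
    """Approximate melting temperature (deg C) of a DNA-DNA hybrid.

    Empirical approximation (assumed, not from the disclosure):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)
```

Under this approximation each percentage point of formamide lowers the estimated melting temperature by about 0.61 degrees C, so moving from 0% to 50% formamide lowers it by roughly 30 degrees C at a fixed salt concentration, which is why the same stringency can be reached at a lower incubation temperature.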

Alternatively, the nucleic acids of interest can be amplified from nucleic acid
samples using amplification techniques. For instance, polymerase chain reaction
(hereinafter "PCR") technology can be used to amplify the sequences of the zmet2a
and/or zmet2b methyltransferase and related genes directly from genomic DNA, from
cDNA, from genomic libraries or from cDNA libraries. PCR and other in vitro
amplification methods may also be useful, for example, to clone nucleic acid
sequences that code for proteins to be expressed, to make nucleic acids to use as
probes for detecting the presence of the desired mRNA in samples, for nucleic acid
sequencing, or for other purposes.
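Whether a primer pair would be expected to amplify a fragment from a known template can be checked in silico before any reaction is run. The following Python sketch is a simplified illustration that assumes exact primer matches on a single template strand; the function names and the short sequences in the usage note are hypothetical and do not correspond to any primer disclosed herein.

```python
def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(sequence.upper()))


def amplicon(template, forward, reverse):
    """Return the predicted product, or None if either primer has no exact site.

    `forward` matches the template strand as written; `reverse` anneals to it,
    so its reverse complement is searched downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]
```

With the made-up template TTTACGTACGGGGCCCCATTTGCAAA, forward primer ACGTACG and reverse primer GCAAAT, the predicted product is ACGTACGGGGCCCCATTTGC; a reverse primer with no site on the template returns None.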

The degree of complementarity (sequence identity) required for detectable
binding will vary in accordance with the stringency of the hybridization medium
and/or wash medium. The degree of complementarity will optimally be 100 percent;
however, it should be understood that minor sequence variations in the probes and
primers may be compensated for by reducing the stringency of the hybridization
and/or wash medium as described earlier.

Appropriate primers and probes for identifying zmet2a and/or zmet2b
methyltransferase nucleic acid sequences from plant tissues are generated from a
comparison of the sequences provided herein. For a general overview of PCR see
PCR Protocols: A Guide to Methods and Applications (Innis, M., Gelfand, D.,
Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990), incorporated herein
by reference.

Nucleic acids may also be synthesized by well-known techniques as described
in the technical literature. See, e.g., Caruthers et al., Cold Spring Harbor Symp.
Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661 (1983).
Double stranded DNA fragments may then be obtained either by synthesizing the
complementary strand and annealing the strands together under appropriate
conditions, or by adding the complementary strand using DNA polymerase with an
appropriate primer sequence.

Proteins of the Present Invention 

The present invention further provides for isolated zmet2a and/or zmet2b
methyltransferases encoded by the zmet2a and/or zmet2b methyltransferase nucleic
acids disclosed herein. One of ordinary skill in the art will recognize that nucleic
acids encoding a functional zmet2a or zmet2b methyltransferase need not have a
sequence identical to the exemplified genes disclosed herein. For example, because
of codon degeneracy, a large number of nucleic acid sequences can encode the same
polypeptide. In addition, the polypeptides encoded by the zmet2a and/or zmet2b
methyltransferase genes, like other proteins, have different domains which perform
different functions. Specifically, zmet2a methyltransferase has ten (10) domains.
These ten domains are identified as follows: I, chromodomain β2, chromodomain β3,
IV, VI, VIII, IX and X. The ten domains and their sequence ranges (as shown in SEQ
ID NO:2) are listed below in Table 1:

TABLE 1

Domain              Amino Acid Sequence Range
I                   244-271
Chromodomain β2     366-379
Chromodomain β3     380-388
IV                  411-434
VI                  456-476
VIII                496-520
IX                  723-746
X                   751-775

Domains I and X are involved in binding AdoMet, which is the source of the
methyl group to be transferred during DNA methylation. Domain IV contains a
catalytic domain. Domain VI aids in the positioning of domain IV. Domain VIII aids
in DNA binding by neutralizing the charge of the phosphodiester backbone. The
region between domain VIII and domain IX defines the sequence specificity of the
zmet2a methyltransferase enzyme. Thus, the zmet2a methyltransferase gene
sequences need not be full length, so long as the desired functional domain of the
protein is expressed.
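The residue ranges of Table 1 lend themselves to a simple lookup, for example to see which domains a partial-length construct would still contain. The Python sketch below is illustrative only; the ranges are copied from Table 1, the chromodomain labels are written as "beta2" and "beta3" as read from that table, and the names DOMAINS, domain_at and domains_retained are hypothetical.

```python
# (domain name, first residue, last residue), numbered as in Table 1.
DOMAINS = [
    ("I", 244, 271),
    ("Chromodomain beta2", 366, 379),
    ("Chromodomain beta3", 380, 388),
    ("IV", 411, 434),
    ("VI", 456, 476),
    ("VIII", 496, 520),
    ("IX", 723, 746),
    ("X", 751, 775),
]


def domain_at(residue):
    """Return the name of the domain containing `residue`, or None if it lies between domains."""
    for name, first, last in DOMAINS:
        if first <= residue <= last:
            return name
    return None


def domains_retained(first_residue, last_residue):
    """List the domains lying entirely within a fragment spanning the given residues."""
    return [name for name, first, last in DOMAINS
            if first_residue <= first and last <= last_residue]
```

For instance, a hypothetical fragment spanning residues 400 to 530 would retain domains IV, VI and VIII but neither of the AdoMet-binding domains I and X.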

The zmet2a methyltransferase protein is at least 912 amino acid residues in
length (see FIG. 2A), preferably, 932 amino acid residues in length (see FIG. 2B).
However, those of ordinary skill in the art will appreciate that amino acid deletions,
substitutions, or additions to the zmet2a methyltransferase protein will typically yield
an enzyme possessing methylating characteristics similar or identical to that of the full
length sequence. Thus, full length zmet2a methyltransferase proteins modified by 1,
2, 3, 4, or 5 deletions, substitutions, or additions generally provide an effective
degree of methylation relative to the full-length protein.

A partial amino acid sequence of the zmet2b methyltransferase protein is
provided for in FIG. 24 and is 256 amino acids in length.

Modified protein chains can also be readily designed utilizing various
recombinant DNA techniques well known to those of ordinary skill in the art. For
example, the chains can vary from the naturally occurring sequence at the primary
structure level by amino acid substitutions, additions, deletions, and the like.
Modification can also include swapping domains from the proteins of the present
invention with related domains from other class II methyltransferases.

The present invention also provides antibodies which specifically react with
the zmet2a and/or zmet2b methyltransferase(s) of the present invention under
immunologically reactive conditions. An antibody immunologically reactive with a
particular antigen can be generated in vivo or by recombinant methods such as by
selection of libraries of recombinant antibodies in phage or similar vectors. The term
"immunologically reactive conditions" as used herein, includes reference to
conditions which allow an antibody, generated to a particular epitope of an antigen, to
bind to that epitope to a detectably greater degree than the antibody binds to
substantially all other epitopes, generally at least two times above background
binding, preferably at least five times above background. Immunologically reactive
conditions are dependent upon the format of the antibody binding reaction and
typically are those utilized in immunoassay protocols.

The term "antibody" as used herein, includes reference to an immunoglobulin
molecule obtained by in vitro or in vivo generation of the humoral response, and
includes both polyclonal and monoclonal antibodies. The term also includes
genetically engineered forms such as chimeric antibodies (e.g., humanized murine
antibodies), heteroconjugate antibodies (e.g., bispecific antibodies), and recombinant
single chain Fv fragments (scFv). The term "antibody" also includes antigen binding
forms of antibodies (e.g., Fab', F(ab')2, Fab, Fv, and inverted IgG. See Pierce
Catalog and Handbook, 1994-1995 (Pierce Chemical Co., Rockford, IL)). An
antibody immunologically reactive with a particular antigen can be generated in vivo
or by recombinant methods such as selection of libraries of recombinant antibodies in
phage or similar vectors (See, e.g., Huse et al., (1989) Science 246:1275-1281;
Ward et al., (1989) Nature 341:544-546; and Vaughan et al., (1996) Nature
Biotechnology, 14:309-314).

Many methods of making antibodies are known to persons of ordinary skill in
the art. A number of immunogens are used to produce antibodies specifically reactive
to the zmet2a and/or zmet2b methyltransferase(s) of the present invention under
immunologically reactive conditions. An isolated recombinant, synthetic, or native
zmet2a and/or zmet2b methyltransferase(s) of the present invention is the preferred
immunogen (antigen) for the production of monoclonal or polyclonal antibodies.

The zmet2a and/or zmet2b methyltransferase(s) is then injected into an animal
capable of producing antibodies. Either monoclonal or polyclonal antibodies can be
generated for subsequent use in immunoassays to measure the presence and quantity
of the zmet2a and/or zmet2b methyltransferases. Methods of producing monoclonal
or polyclonal antibodies are known to those of skill in the art (See, Coligan (1991)
Current Protocols in Immunology, Wiley/Greene, NY; Harlow and Lane (1989)
Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY; and Goding (1986)
Monoclonal Antibodies: Principles and Practice (2d ed.), Academic Press, New
York, NY).

Frequently, the zmet2a and/or zmet2b methyltransferase(s) and antibodies will
be labeled by joining, either covalently or non-covalently, a substance which provides
for a detectable signal. A wide variety of labels and conjugation techniques are
known and are reported extensively in both the scientific and patent literature.
Suitable labels include radionucleotides, enzymes, substrates, cofactors, inhibitors,
fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like.
Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752;
3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

The antibodies of the present invention can be used to screen plants for the
expression of the zmet2a and/or zmet2b methyltransferase(s). The antibodies of the
present invention are also used for affinity chromatography in isolating zmet2a
and/or zmet2b methyltransferase(s).

The present invention further provides zmet2a and/or zmet2b
methyltransferase polypeptides that specifically bind, under immunologically reactive
conditions, to an antibody generated against a defined immunogen, such as an
immunogen consisting of the polypeptides of the present invention. For example,
immunogens will generally be at least 912 contiguous amino acids from the zmet2a
methyltransferase polypeptide of the present invention. Nucleic acids which encode
such cross-reactive zmet2a and/or zmet2b methyltransferase polypeptides are also
provided by the present invention. The zmet2a/zmet2b methyltransferase
polypeptides can be isolated from any number of plants as discussed earlier.
Preferred plants are Zea mays, Oryza sativa, Secale cereale, Triticum aestivum,
Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa,
Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, and Brassica
napus.

As used herein, the term "specifically binds" includes reference to the
preferential association of a ligand, in whole or part, with a particular target molecule
(i.e., "binding partner" or "binding moiety") relative to compositions lacking that
target molecule. It is, of course, recognized that a certain degree of non-specific
interaction may occur between a ligand and a non-target molecule. Nevertheless,
specific binding may be distinguished as mediated through specific recognition of the
target molecule. Typically, specific binding results in a much stronger association
between the ligand and the target molecule than between the ligand and non-target
molecule. Specific binding by an antibody to a protein under such conditions requires
an antibody that is selected for its specificity for a particular protein. The affinity

30 constant of the antibody binding site for its cognate monovalent antigen is at least 10^ 
usually at least lO''', more preferably at least 10'°, and most preferably at least lO" 
liters/mole. A variety of immunoassay formats are appropriate for selecting
antibodies specifically reactive with a particular protein. For example, solid-phase
ELISA immunoassays are routinely used to select monoclonal antibodies specifically
reactive with a protein (See Harlow and Lane (1988) Antibodies, A Laboratory
Manual, Cold Spring Harbor Publications, New York, for a description of
immunoassay formats and conditions that can be used to determine specific
reactivity). The antibody may be polyclonal but preferably is monoclonal. Generally,
antibodies cross-reactive to zmet2a and/or zmet2b methyltransferases are removed by
immunoabsorption.

Immunoassays in the competitive binding format are typically used for cross-reactivity
determinations. For example, an immunogenic zmet2a and/or zmet2b
methyltransferase polypeptide is immobilized to a solid support. Polypeptides added
to the assay compete with the binding of the antisera to the immobilized antigen. The
ability of the above polypeptides to compete with the binding of the antisera to the
immobilized zmet2a and/or zmet2b methyltransferase polypeptides is compared to the
immunogenic zmet2a and/or zmet2b methyltransferase polypeptide(s). The percent
cross-reactivity for the above proteins is calculated, using standard calculations.
Those antisera with less than 10% cross-reactivity with such proteins as zmet2a
and/or zmet2b methyltransferase(s) are selected and pooled. The cross-reacting
antibodies are then removed from the pooled antisera by immunoabsorption with the
non-zmet2a and/or non-zmet2b methyltransferase polypeptide(s).

The immunoabsorbed and pooled antisera are then used in a competitive
binding immunoassay to compare a second "target" polypeptide to the immunogenic
polypeptide. In order to make this comparison, the two polypeptides are each assayed
at a wide range of concentrations and the amount of each polypeptide required to
inhibit 50% of the binding of the antisera to the immobilized protein is determined
using standard techniques. If the amount of the target polypeptide required is less
than twice the amount of the immunogenic polypeptide that is required, then the target
polypeptide is said to specifically bind to an antibody generated to the immunogenic
protein. As a final determination of specificity, the pooled antisera are fully
immunoabsorbed with the immunogenic polypeptide until no binding to the
polypeptide used in the immunoabsorption is detectable. The fully immunoabsorbed
antisera are then tested for reactivity with the test polypeptide. If no reactivity is
observed, then the test polypeptide is specifically bound by the antisera elicited by the
immunogenic protein.
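
By way of illustration only, the comparison described above can be sketched in a few lines of Python. The text refers only to "standard calculations"; the ratio of 50% inhibition amounts used below is one common convention and is an assumption, as are the function names. The "less than twice" rule is taken from the paragraph above.

```python
def percent_cross_reactivity(ic50_immunogen: float, ic50_competitor: float) -> float:
    """Percent cross-reactivity from the amounts needed for 50% inhibition.

    Assumption: cross-reactivity is expressed as the ratio of the amount of
    immunogen to the amount of competitor required for 50% inhibition.
    """
    if ic50_immunogen <= 0 or ic50_competitor <= 0:
        raise ValueError("amounts must be positive")
    return 100.0 * ic50_immunogen / ic50_competitor


def binds_specifically(ic50_immunogen: float, ic50_target: float) -> bool:
    """Rule from the text: the target polypeptide qualifies if it requires
    less than twice the amount of the immunogenic polypeptide."""
    return ic50_target < 2.0 * ic50_immunogen


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units, for illustration only.
    print(percent_cross_reactivity(1.0, 10.0))
    print(binds_specifically(1.0, 1.5))
```

Under this convention, a competitor that needs ten times as much protein as the immunogen to reach 50% inhibition scores 10% cross-reactivity.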

Production of Recombinant Expression Cassettes 
Isolated sequences prepared as described herein can then be used to provide
recombinant expression cassettes. One of ordinary skill in the art will recognize that
the nucleic acids used in the recombinant expression cassettes described herein
encoding a functional zmet2a and/or zmet2b methyltransferase(s) need not have a
sequence identical to the exemplified genes disclosed herein. In addition, the
polypeptides encoded by the zmet2a and/or zmet2b methyltransferase genes, like
other proteins, have different domains which perform different functions. Thus, the
zmet2a and/or zmet2b methyltransferase gene sequences need not be full length, so
long as the desired functional domain of the protein is expressed.

A DNA sequence coding for the desired zmet2a and/or zmet2b
methyltransferase polypeptide(s), for example a cDNA or a genomic sequence
encoding a full length protein, can be used to construct a recombinant expression
cassette which can be introduced into a desired plant. An expression cassette will
typically comprise the zmet2a and/or zmet2b methyltransferase nucleic acid(s)
operably linked in either the sense or antisense direction to transcriptional and
translational initiation regulatory sequences which will direct the transcription of the
sequence from the zmet2a and/or zmet2b methyltransferase gene(s) in the intended
tissues of the transformed plant.

For example, a plant promoter fragment may be employed which will direct
expression of the zmet2a and/or zmet2b methyltransferase in all tissues of a
regenerated plant. Such promoters are referred to herein as "constitutive" promoters
and are active under most environmental conditions and states of development or cell
differentiation. Examples of constitutive promoters include the cauliflower mosaic
virus (CaMV) 35S transcription initiation region, the 1'- or 2'-promoter derived from
T-DNA of Agrobacterium tumefaciens, ubiquitin, and other transcription initiation
regions from various plant genes known to those of ordinary skill in the art.



Alternatively, the plant promoter may direct expression of the zmet2a and/or
zmet2b methyltransferase gene in a specific tissue or may be otherwise under more
precise environmental or developmental control. Such promoters are referred to here
as "inducible" promoters. Examples of environmental conditions that may affect
transcription by inducible promoters include pathogen attack, anaerobic conditions, or
the presence of light.

Examples of promoters under developmental control include promoters that
initiate transcription only in certain tissues, such as leaves, roots, fruit, seeds, or
flowers. The operation of a promoter may also vary depending on its location in the
genome. Thus, an inducible promoter may be fully or partially constitutive in certain
locations.



The endogenous promoters from the zmet2a and/or zmet2b methyltransferase
genes of the present invention can be used to direct expression of the genes. These
promoters can also be used to direct expression of heterologous structural genes. The
promoters can be used, for example, in recombinant expression cassettes to drive
expression of genes to produce DNA methyltransferase in a particular cell or tissue.

To identify the promoters, the 5' portions of the clones described herein are
analyzed for sequences characteristic of promoter sequences. For instance, promoter
sequence elements include the TATA box consensus sequence (TATAAT), which is
usually 20 to 30 base pairs upstream of the transcription start site. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) N G. J.
Messing et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and
Hollaender, eds. 1983).
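
As a minimal sketch of such an analysis, and not the procedure actually used, the following Python fragment scans a 5' flanking sequence for the TATAAT motif named above and reports each hit as a coordinate relative to the transcription start site. The function names, the coordinate convention (the last base of the input is position -1) and the example sequence are all assumptions made for illustration.

```python
def find_tata_boxes(upstream: str, motif: str = "TATAAT") -> list:
    """Return the position of each motif occurrence relative to the start site.

    `upstream` is the 5' flanking sequence, written 5' to 3', whose last base
    is taken to sit at position -1. The value returned for each hit is the
    (negative) coordinate of the first base of the motif.
    """
    seq = upstream.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start - len(seq))
        start = seq.find(motif, start + 1)
    return hits


def in_expected_window(position: int, lo: int = -30, hi: int = -20) -> bool:
    """True if a hit lies 20 to 30 base pairs upstream of the start site."""
    return lo <= position <= hi


if __name__ == "__main__":
    # Hypothetical flanking sequence: 70 G, the motif, then 24 C.
    flank = "G" * 70 + "TATAAT" + "C" * 24
    for pos in find_tata_boxes(flank):
        print(pos, in_expected_window(pos))
```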

If proper polypeptide expression is desired, a polyadenylation region at the
3'-end of the zmet2a or zmet2b methyltransferase coding region should be included.
The polyadenylation region can be derived from the natural gene, from a variety of
other plant genes, or from T-DNA.

The vector comprising the sequences from the zmet2a and/or zmet2b
methyltransferase gene(s) will typically comprise a marker gene which confers a
selectable phenotype on plant cells. For example, the marker may encode biocide
resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418,
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron.

As discussed above, the zmet2a and/or zmet2b methyltransferase gene(s) can
be inserted into a recombinant expression cassette in the antisense direction.
Expression of the zmet2a and/or zmet2b methyltransferase gene(s) in the antisense
direction will result in the production of antisense RNA. As is well known, a cell
manufactures protein by transcribing the DNA of the gene encoding a protein to
produce RNA, which is then processed to messenger RNA (mRNA) (e.g., by the
removal of introns) and finally translated by ribosomes into protein. This process
may be inhibited in the cell by the presence of antisense RNA. The term antisense
RNA means an RNA sequence which is complementary to a sequence of bases in the
mRNA in question in the sense that each base (or the majority of bases) in the
antisense sequence (read in the 3' to 5' sense) is capable of pairing with the
corresponding base (G with C, A with U) in the mRNA sequence read in the 5' to 3'
sense. It is believed that this inhibition takes place by formation of a complex
between the two complementary strands of RNA, thus preventing the formation of
protein. How this works is uncertain: the complex may interfere with further
translation, or degrade the mRNA, or have more than one of these effects. This
antisense RNA may be produced in the cell by transformation of the cell with an
appropriate DNA construct designed to transcribe the non-template strand (as opposed
to the template strand) of the relevant gene (or of a DNA sequence showing
substantial homology therewith).
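
The base-pairing definition given above can be illustrated with a short Python sketch. It is offered only as an illustration of complementarity (G with C, A with U); the function names and the six-base example are hypothetical and do not correspond to any sequence disclosed herein.

```python
# Watson-Crick pairing for RNA, as stated in the text: G with C, A with U.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the fully complementary antisense RNA, written 5' to 3',
    for an mRNA that is also written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(mrna.upper()))


def pairs_with(mrna: str, antisense: str) -> bool:
    """Check that each base of the antisense strand, read 3' to 5', pairs
    with the corresponding base of the mRNA read 5' to 3'."""
    if len(mrna) != len(antisense):
        return False
    read_3_to_5 = reversed(antisense.upper())
    return all(PAIRS[m] == a for m, a in zip(mrna.upper(), read_3_to_5))


if __name__ == "__main__":
    message = "AUGGCC"  # hypothetical mRNA fragment
    anti = antisense_rna(message)
    print(anti, pairs_with(message, anti))
```

The sketch requires every base to pair, whereas the definition above also admits sequences in which only the majority of bases pair.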

The use of antisense RNA to downregulate the expression of specific plant
genes is well known. Reduction of gene expression has led to a change in the
phenotype of a plant, either at the level of gross visible phenotypic difference (e.g.,
lack of anthocyanin production in flower petals of petunia leading to colorless instead
of colored petals (see van der Krol et al., Nature, 333:866-869 (1988))), or at a more
subtle biochemical level, for example, a change in the amount of polygalacturonase
and reduction in depolymerization of pectin during tomato fruit ripening (Smith et al.,
Nature, 334:724-726 (1988)). Another more recently described method of inhibiting
gene expression in transgenic plants is the use of sense RNA transcribed from an
exogenous template to downregulate the expression of specific plant genes
(Jorgensen, Keystone Symposium "Improved Crop and Plant Products through
Biotechnology", Abstract X1-022 (1994)). Thus, both antisense and sense RNA have
been proven to be useful in achieving downregulation of gene expression in plants,
and both are encompassed by the present invention.

Production of Transgenic Plants 

Techniques for transforming a wide variety of higher plant species using the
recombinant expression cassettes hereinbefore described are well known and
described in the technical and scientific literature. See, for example, Weising et al.,
Ann. Rev. Genet. 22:421-477 (1988).

The hereinbefore described recombinant expression cassettes may be
introduced into the genome of a desired plant host by a variety of conventional
techniques. For example, the DNA construct may be introduced directly into the
genomic DNA of the plant cell using techniques such as electroporation, PEG
poration, particle bombardment and microinjection of plant cell protoplasts or
embryogenic callus, or the DNA constructs can be introduced directly to plant tissue
using ballistic methods, such as DNA particle bombardment. In the alternative, the
DNA constructs may be combined with suitable T-DNA flanking regions and
introduced into a conventional Agrobacterium tumefaciens or Agrobacterium
rhizogenes host vector. The virulence functions of the Agrobacterium host will direct
the insertion of the construct and adjacent marker into the plant cell DNA when the
cell is infected by the bacteria.

Transformation techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using
polyethylene glycol precipitation is described in Paszkowski et al., EMBO J.
3:2712-2722 (1984). Electroporation techniques are described in Fromm et al., Proc.
Natl. Acad. Sci. USA 82:5824 (1985). Biolistic transformation techniques are
described in Klein et al., Nature 327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques are well
described in the scientific literature. See, for example, Horsch et al., Science
233:496-498 (1984), and Fraley et al., Proc. Natl. Acad. Sci. USA 80:4803 (1983).
Although Agrobacterium is useful primarily in dicots, certain monocots can be
transformed by Agrobacterium. For instance, Agrobacterium transformation of rice is
described by Hiei et al., Plant J., 6:271-282 (1994).

Transformed plant cells which are derived by any of the above transformation
techniques can be cultured to regenerate a whole plant which possesses the
transformed genotype. Such regeneration techniques rely on manipulation of certain
phytohormones in a tissue culture growth medium, typically relying on a biocide
and/or herbicide marker which has been introduced together with the zmet2a and/or
zmet2b methyltransferase nucleotide sequence(s). Plant regeneration from cultured
protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook
of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New York,
1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press,
Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants,
organs, or parts thereof. Such regeneration techniques are described generally in Klee
et al., Ann. Rev. of Plant Phys. 38:467-486 (1987).

The methods of the present invention are particularly useful for incorporating
the zmet2a and/or zmet2b methyltransferase nucleic acid(s) into transformed plants in
ways and under circumstances which are not found naturally. In particular, the
zmet2a and/or zmet2b methyltransferase(s) may be expressed at times or in quantities
which are not characteristic of natural plants.

One of ordinary skill in the art will recognize that after the expression cassette
is stably incorporated in transgenic plants and confirmed to be operable, it can be
introduced into other plants by sexual crossing. Any of a number of standard
breeding techniques can be used, depending upon the species to be crossed.

The hereinbefore described expression cassettes can be inserted into a plant in
order to reduce or alter the amount of DNA methylation in a plant. Preferably, such
an expression cassette contains the zmet2a and/or zmet2b methyltransferase gene(s)
inserted into the cassette in the antisense direction as described earlier. A reduction or
alteration in the amount of DNA methylation in a plant can be used to stabilize
transgene expression in a transgenic plant.

One of the difficulties with the production of transgenic plants is that many
transgenes are silenced or are not stable through successive generations. In many
cases, transgene silencing is associated with increased DNA methylation. The
hereinbefore described expression cassettes of the present invention containing the
zmet2a and/or zmet2b methyltransferase gene(s) in the antisense direction can be
inserted into a plant either before, concurrently with or after the insertion of another
expression cassette containing a transgene which is to be expressed in the plant, such
as, but not limited to, a resistance or drought tolerance gene, etc. The antisense RNA
produced by the hereinbefore described expression cassette can then form a complex
with the endogenous mRNA from the zmet2a and/or zmet2b methyltransferase gene(s)
within the plant. This complex should reduce or alter the amount of DNA
methylation occurring in vivo in the plant. This reduction in DNA methylation should
prevent the silencing of the desired transgene in the plant.

In a similar manner, the expression cassettes described herein can be used to
modify or alter the yield or biochemical qualities of a plant. As discussed earlier,
certain genes in plants and animals are expressed differentially when transmitted
through a male versus female parent. This phenomenon is known as imprinting.
Imprinting is an epigenetic system correlated with DNA methylation. A reduction or
alteration of DNA methylation in a plant by transforming a plant with an expression
cassette containing the zmet2a and/or zmet2b methyltransferase gene(s) in the
antisense direction may affect the yield and biochemical qualities of a plant.

The hereinbefore described expression cassettes can also be used to silence the
expression of a particular targeted gene in plants in vivo. More specifically, the
expression cassettes of the present invention containing a tissue-specific promoter and
the zmet2a and/or zmet2b methyltransferase gene(s) in the sense direction can be
inserted into a plant. The tissue-specific promoter will direct expression of the
zmet2a and/or zmet2b methyltransferase gene(s) in an area containing the desired
targeted gene. Translation of the zmet2a and/or zmet2b methyltransferase gene(s) in
the specific area will result in an increase in methylation in the area of the targeted
gene. This increase in methylation can silence the targeted gene.

Transgenic plants containing the expression cassettes described herein and 
which exhibit a reduction in DNA methylation can be identified by using methylation 
sensitive restriction enzymes or High Performance Liquid Chromatography. 
Techniques for using methylation sensitive restriction enzymes and High Performance 
Liquid Chromatography are well known in the art. Transgenic plants containing the 
expression cassettes described herein and which exhibit an increase in DNA 
methylation can be identified by using a Northern Blot analysis which is well known 
in the art. 

Additionally, the hereinbefore described expression cassettes can be used in 
gene therapy for human diseases which are caused by the amplification of 
trinucleotide repeats. 

The following Examples are offered by way of illustration, not limitation. 

EXAMPLES 

EXAMPLE 1 -Cloning and Sequencing of Zmet2a 
a. Cloning and Sequencing 

A partial cDNA clone (CGET064) from an immature tassel cDNA library was
obtained from Pioneer Hi-Bred International (Des Moines, Iowa). This clone was
identified in an expressed sequence tag (hereinafter "EST") database using known
DNA methyltransferase sequences for comparison. This original cDNA clone
contained sequences from bp 151 to bp 2569 shown in FIG. 1A and 1B. The
sequence of this clone, which represents the 3' end of the transcript, was used to design
forward and reverse primers for 5' and 3' Rapid Amplification of cDNA Ends
(hereinafter "RACE"). RACE was conducted using the Marathon cDNA
Amplification Kit (available from Clontech) on cDNA prepared from Mo17 10 day
old seedling mRNA. Mo17 is publicly available from the National Seed Storage
Lab (Fort Collins, Colorado). RACE products were isolated and ends sequenced
using Marathon primers and gene specific primers. The remaining sequence was
obtained from PCR products by primer walking. The primers used were AP2, 1F, 1R,
2R, 3R, 4F, 5F, 8R, 8F, 9R, 9F, 14F, 17F, and RaceRT (see FIG. 3). Two sequencing
passes were made on the Mo17 cDNA ends and four sequencing passes were made on
the intervening regions, three from Mo17 cDNA and one from B73. B73 is publicly
available from the National Seed Storage Lab (Fort Collins, Colorado). A consensus
sequence for the coding region was generated and is shown in FIG. 1A and 1B.

Genomic sequence spanning primers 1F and 1R was obtained from Pioneer
Hi-Bred International. To obtain the remaining genomic sequence of zmet2a, the
CGET064 clone was used to probe a Mo17 genomic library (Stratagene). Lambda
clones 4a, 4c, 4d1 and 4d2 were determined to be positive clones containing a
sequence identical to CGET064. Lambda clone 4a did not contain the full length gene;
therefore, sequence data was obtained from clone 4c. No analysis of clones 4d1 or
4d2 was conducted. Clone 4c was subcloned into pGEM7zf(+) (Promega) using
double digests involving HindIII, XhoI, EcoRI, and BamHI. Genomic sequence was
obtained from a combination of subclones pHX8 (bp 7311-8878), pHX9 (bp
9173-10135), and pB11 (bp 5269-8447) and by primer walking using primers T7, Sp6,
M13F, M13R, Seq2FN, Seq2RN, S3F, S3R, 7F, 8eR, 9F, 9R, 11iR, 11iF, 12iR, 12iF,
13iR, 13iF, 14F, 14R, 15R, 15F, 16R, 16F, 17R, 17F, 18R, 18F, and RaceRT (see
FIG. 3). Borders of the Mu insertion of zmet2a::Mul were sequenced from PCR
products using primer 5F and a Mu primer (see FIG. 3). Map locations of the zmet2a
primers are shown in FIG. 5.

PCR products were sequenced using Big Dye terminator cycle sequencing on
an ABI sequencer (Perkin-Elmer Applied Biosystems) at the University of Wisconsin
Biotechnology Center Sequencing Facility (Madison, WI). Sequence data was
processed using computational tools available through the World Wide Web
(hereinafter, "WWW"), summarized in FIG. 6.

b. Mutant Analysis

A mutant allele called zmet2a::Mul was obtained from Pioneer Hi-Bred
International's TUSC system. This mutant allele contains a Mutator transposable
element insertion and was identified in a Mutator population using a Mu specific
primer and a zmet2a gene specific primer. Since the Mutator population is quite
variable, heterozygous zmet2a::Mul F2 seed was advanced by selfing at the
University of Wisconsin West Madison Agronomy Farm (Madison, Wisconsin), the
University of Wisconsin Walnut Street greenhouses (Madison, Wisconsin), and at the
University of Wisconsin winter nursery in Puerto Rico to produce the F4 derived F5
segregating family primarily used in this example.

DNA from 15 plants of the F4 derived F5 segregating family was used for
HPLC analysis. A subset of these plants was used for Southern analysis. The 5th to
7th immature leaf tips were collected and immediately frozen in dry ice. Tissue was
ground in liquid nitrogen and DNA was extracted using a modified CTAB method of
Saghai-Maroof et al. (Proc. Natl. Acad. Sci. USA 81:8014-8018 (1984)). Tissue was
incubated in CTAB (Sigma) extraction buffer for 2 hours at 65 °C, extracted with
chloroform/isoamyl alcohol, treated with 0.5 mg RNase A (Sigma) for 30 minutes at
37 °C, extracted again with chloroform/isoamyl alcohol, precipitated with
isopropanol, washed with 10 mM ammonium acetate/76% ethanol, and resuspended in
TE.

Plants were genotyped by Southern analysis. DNA (10 µg) was digested with
BamHI and EcoRI, which cut on each side of the Mu insertion. The digested DNA
was electrophoresed through a 0.8% agarose 0.5X TBE gel. DNA was transferred to
Immobilon nylon membrane (Millipore) with 5X SSC. Blots were UV cross-linked
for 25 seconds and dried at 80 °C for 1.5 hours. Pre-hybridization was carried out in
5X SSC, 50 mM Tris pH 8.0, 0.2% SDS, 10 mM EDTA, 2.5X Denhardt's solution, and
0.1 mg/ml single stranded sheared herring DNA overnight (8-16 hours) at 65 °C.
Hybridization conditions were similar to pre-hybridization except for the addition of
5% dextran sulfate to the hybridization solution. Probes (25-50 ng) (clone CGET064
for genotyping) were radioactively labeled using a random priming reaction
containing 50 µCi of P-32 labeled dCTP. Following overnight hybridization at 65 °C,
blots were washed 2X (0.15X SSC, 0.1% SDS) for 30-45 minutes at 65 °C.
Hybridized blots were then exposed to Kodak Biomax film.

Southern analysis with methylation sensitive restriction enzymes was
conducted in a similar manner except that 5 µg of DNA was digested. Enzymes
included in the study were: ApaI, AvaII, BamHI, BstNI, ClaI, EcoOm, EcoRI,
EcoRII, HaeIII, HinfI, HhaI, HpaII, MspI, PstI, PvuII, SacI, Sau3A, ScrFI, SmaI, and
XhoI. Probes for repetitive sequence regions of the maize genome, including a 9 kb
clone for the maize 26s-5.8s-17s repeat (reviewed in McMullen et al., Molecular
Analysis of the Nucleolus Organizer Region in Maize. In: Chromosome Engineering
in Plants: Genetics, Breeding, and Evolution. Gupta PK, Tsuchiya T (eds), pp.
561-576 (1991)), the 5s ribosomal subunit clone (Mascia et al., Gene, 15:7-20 (1981)),
and centromere probe pSau3a9 (Jiang et al., Proc. Natl. Acad. Sci. USA 93:14210-14213
(1996)), were used to analyze changes in methylation due to zmet2a::Mul.

HPLC was conducted according to a modified protocol of Gehrke et al. (J.
Chromatogr. 301:199-219 (1984)). Duplicate preparations for each of fifteen plants
were analyzed. Twenty-five micrograms of DNA was diluted with water to a volume of
50 µl, denatured at 96 °C for 5 minutes and immediately placed on ice. One hundred
microliters of 30 mM ammonium acetate (pH 5.3), 5 µl of 20 mM zinc sulfate and 10
µl of Nuclease P1 (1 mg/ml in 30 mM ammonium acetate (pH 5.3)) were added and
the mixture was incubated at 37 °C for 2 hours. This reaction cleaves 5'
mononucleotides from single stranded DNA. The pH was adjusted with 20 µl of Tris
(pH 8.5), and approximately 15 units of Calf Intestinal Alkaline Phosphatase were
added and incubated at 37 °C for an additional 2 hours, which converts the
nucleotides to nucleosides. Samples were frozen at -20 °C until HPLC analysis.

HPLC analysis was conducted at the University of Wisconsin Biotechnology
Center. A volume of 50 µl was injected into a Brownlee Lab Spheri-5 RP-8 column.
Nucleosides were separated with a flow rate of 0.75 ml/min using a gradient program
consisting of 30 minutes in buffer A (0.05 M potassium phosphate pH 4.0, 2.5%
methanol) and 19 minutes in buffer B (0.05 M potassium phosphate pH 4.0, 20%
methanol). The column was flushed with 70% methanol for 13 minutes and then
re-equilibrated with buffer A for 23 minutes before the injection of the next sample. All
samples were analyzed on a Beckman System Gold chromatograph and nucleosides
detected at A260nm and A280nm. Nucleoside and nucleotide standards (Sigma) were
used to determine nucleoside peak positions and to create a standard curve to
determine nucleoside concentration. The ratio of 5-methylcytosine to total cytosine
was calculated and statistical analysis conducted using SAS.
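
The ratio described above can be expressed as a short calculation. The following Python sketch is illustrative only: it assumes that "total cytosine" means the sum of the 5-methylcytosine and unmethylated cytosine concentrations read from the standard curve, and the function names and numerical inputs are hypothetical rather than measured values.

```python
def methylcytosine_fraction(conc_5mc: float, conc_c: float) -> float:
    """Ratio of 5-methylcytosine to total cytosine.

    Assumption: total cytosine = 5-methylcytosine + unmethylated cytosine,
    both given as concentrations in the same units.
    """
    total = conc_5mc + conc_c
    if total <= 0:
        raise ValueError("total cytosine must be positive")
    return conc_5mc / total


def percent_decrease(reference: float, observed: float) -> float:
    """Relative decrease of `observed` with respect to `reference`, in percent."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * (reference - observed) / reference


if __name__ == "__main__":
    # Hypothetical concentrations, chosen only to make the arithmetic visible.
    wild_type = methylcytosine_fraction(25.0, 75.0)
    mutant = methylcytosine_fraction(22.0, 78.0)
    print(wild_type, mutant, percent_decrease(wild_type, mutant))
```

Comparing genotypic classes by relative decrease in this way matches the form in which the HPLC results are reported; the statistical analysis itself is not reproduced here.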



To test remethylation as an indication of de novo methylase activity, an F1
hybrid of an F4 line homozygous for zmet2a::Mul and the inbred line Mo17 was
backcrossed to the nonmutant Mo17 parent to generate plants homozygous wild-type
and plants heterozygous for zmet2a::Mul. Seedlings of the F1, the BC1 progeny, the
Mo17 parent and a sib of the F4 zmet2a::Mul parent were grown in the greenhouse
and DNA extraction and Southern analysis conducted as previously described. DNA
was digested with MspI and PstI and probed with the aforementioned repetitive
clones.



c. Expression Analysis 

The expression of zmet2a was determined by hybridizing the zmet2a cDNA
probe to a Southern blot of cDNAs prepared from different tissues and tissues at
different stages of development. Tissues included in this study are embryos 24 days
after pollination, 10 day seedlings, immature ear, immature tassel, immature leaf from
mutant and nonmutant plants, and roots. Total RNA was extracted using Trizol
(Gibco/BRL) according to the manufacturer's protocol. The PolyATtract System
(Promega) was used to isolate mRNAs from all tissues except 10 day seedlings,
from which mRNA was isolated using oligo dT cellulose columns (Pharmacia). cDNA
was synthesized from the isolated RNAs using the Marathon cDNA Amplification Kit
(Clontech).



d. Results

zmet2a shares sequence similarity with other DNA methyltransferases

zmet2a is a member of a small gene family. Three cohybridizing bands are
observed on a Southern blot of B73 DNA digested with HindIII and probed with
clone CGET064, which does not contain a HindIII restriction site (see FIG. 7).
zmet2a, which maps to the long arm of chromosome 10, is coded on 20 exons with 19
intervening introns (FIG. 5). The inferred protein using the first predicted translation
start site located within a consensus Kozak sequence (Kozak, J. Cell. Biol.,
115:887-903 (1991)) is composed of at least 912 amino acids with a predicted mass of
101 kD (kilodaltons). A protein of this size with an affinity for CpNpG sequences was
isolated in Pisum sativum by Pradhan and Adams (Plant J., 7:471-481 (1995)).

Comparisons with Arabidopsis chromomethylase, CMT1

Sequence of zmet2a (FIG. 1A and 1B) reveals that it lacks the large
N-terminal domain found in the maintenance enzymes but does possess the six highly
conserved motifs of the C-terminal catalytic domain. Database searches using
BLAST (http://www.ncbi.nlm.nih.gov/gov/BLAST/) show that zmet2a has highest
sequence homology to the Arabidopsis chromomethylase, CMT1 (see Henikoff and
Comai, Genetics 148:307-318 (1998)), with 44% identity and 57% conservation. The
N-terminal region is larger in zmet2a; however, there is an additional downstream
predicted start site, also within a consensus Kozak sequence, that codes for an enzyme
of 809 amino acids, which is more similar in size to the most closely related CMT1,
which is composed of 791 amino acids.

Alignments of zmet2a with CMT1 and the catalytic domains of Arabidopsis
MET1 and maize zmet1 maintenance enzymes show conservation in the important
functional motifs I, IV, VI, VIII, IX and X, providing evidence that it is indeed a
DNA methyltransferase (FIG. 8). zmet2a and CMT1 are 87% conserved across the
defined six conserved domains, as shown in the underlining in FIG. 8. zmet2a and
CMT1 also have 60% conservation in the variable region sequence between the
defined underlined motifs VIII and IX in FIG. 8, which contains a region known as
the target recognition domain in the bacterial methyltransferases. The bacterial
methylase M.HhaI has been crystallized and functions deduced for the conserved
amino acids (Cheng et al., Cell, 74:299-307 (1993)). The zmet2a amino acids
involved in catalysis were predicted by comparison to M.HhaI. The amino acids
interacting with SAM and with cytosine are summarized in FIG. 9.

zmet2a mutant plants have reduced methylation at CpNpG sites

A reverse genetics approach was used to ascertain the function of zmet2a. An
F2 family segregating for a Mutator (Mu) insertion in the exon encoding motif IX was
identified using a PCR primer for Mu and a gene-specific primer for zmet2a. This
allele is called zmet2a::Mul. The insertion of Mu into exon 19 results in a transcript
that would code for a protein truncated at the point of the Mu insertion in motif IX
due to the introduction of a stop codon. The resulting protein is expected to be
dysfunctional since it lacks Motif X, which is required for S-Adenosyl methionine
(hereinafter "SAM") binding (Cheng et al., Cell 74:299-307 (1993)).

Reduced methylation observed by restriction enzyme analysis 

To reduce the genetic background variation associated with the heterogeneous
origin of the Mutator population, restriction enzyme analysis was conducted on an F4
derived F5 family segregating for zmet2a::Mul. The restriction enzyme isoschizomers
HpaII/MspI, in addition to other methylation sensitive enzymes, were used to
determine methylation pattern differences among the three genotypic classes. HpaII
and MspI both recognize the sequence CCGG but differ in their sensitivity to
methylation. HpaII digestion is inhibited unless both cytosines are unmethylated,
whereas MspI can digest C^mCGG sequences but not ^mCCGG sites (^mC denoting
5-methylcytosine). The methylation status at CpG sites can be assessed by digesting
with HpaII; similarly, MspI digestion is used to determine the state of methylation at
CpCpG sites specifically and may provide a general indication of methylation changes
occurring at CpNpG sites.
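
The differing sensitivities of the two isoschizomers can be summarized as a small truth table. The Python sketch below encodes only the rules stated in the preceding paragraph; the function names and the boolean representation of a CCGG site (methylation state of the outer and inner cytosine) are assumptions made for illustration.

```python
def hpaii_cuts(outer_methylated: bool, inner_methylated: bool) -> bool:
    """HpaII digests CCGG only when both cytosines are unmethylated."""
    return not outer_methylated and not inner_methylated


def mspi_cuts(outer_methylated: bool, inner_methylated: bool) -> bool:
    """MspI digests CCGG whether or not the inner cytosine is methylated,
    but is blocked when the outer cytosine is methylated."""
    return not outer_methylated


if __name__ == "__main__":
    # Print the digestion outcome for all four methylation states of a site.
    for outer in (False, True):
        for inner in (False, True):
            print(
                f"outer={outer!s:5} inner={inner!s:5} "
                f"HpaII={hpaii_cuts(outer, inner)!s:5} "
                f"MspI={mspi_cuts(outer, inner)}"
            )
```

A site cut by MspI but not by HpaII therefore indicates methylation of the inner (CpG) cytosine, whereas a site resistant to MspI indicates methylation of the outer cytosine.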

Results indicate significant reductions in cytosine methylation at ^mCCG sites,
as indicated by a more complete digestion by MspI in plants homozygous for
zmet2a::Mul (FIG. 10 A-C). Plants heterozygous for zmet2a::Mul were intermediate
in their digestion pattern. Although the frequency of methylated cytosines is much
higher at CpG sequences, no changes in methylation were observed among the
genotypic classes when digested with HpaII (FIG. 10 A-C).

The isoschizomers BstNI and EcoRII recognize the sequence CC(A/T)GG. BstNI
is not sensitive to cytosine methylation and EcoRII is inhibited at C^mC(A/T)GG sites
(^mC denoting 5-methylcytosine). Nearly all of these sites are methylated in repetitive
sequences, as a low level of EcoRII digestion is observed only in zmet2a::Mul plants
(see FIG. 11), whereas digests with BstNI are completely digested to lower molecular
weight fragments for all genotypes. These methylated sites may not be subject to
zmet2a activity but may instead be methylated by another member of the zmet2a gene
family or by zmet1, or possibly de novo methylated after each cell cycle by zmet3.
Other restriction enzymes were used to clarify the apparent sequence specificity of
methylation reduction at CpNpG sites. As with the isoschizomers, no digestion
differences are observed with CpG sensitive enzymes HhaI [G^mCGC] and ClaI
[AT^mCGAT]. More complete digestion is observed in plants homozygous for
zmet2a::Mul with enzymes sensitive to methylation at CpNpG sites. FIG. 12 shows
digestion patterns for enzymes sensitive to methylation at CpNpG sites: EcoRII, PstI,
BamHI, and AvaII. In addition to EcoRII as previously mentioned, reduced
methylation in one or more of the repetitive regions was observed with BglII
[AGAT^mCT], PstI [^mCTGCAG], BamHI [GGAT^mCC], and AvaII [GG(A/T)^mC^mC].
It should be noted that AvaII may include some CpG overlapping sites. Subtle
differences in digestion patterns of one or more of the repetitive sequences were also
observed with Sau3AI [GAT^mC], ApaI [GGG^mCC^mC], and XhoI [^mCT^mCGAG]. With
these enzymes it is not possible to unambiguously determine whether the source of the
difference is CpG or CpNpG methylation. Differences were also observed with ScrFI
[C^mCNGG], which duplicates the targeted sequences and methylation sensitivities of
EcoRII, MspI and HpaII. Although in many cases the observed reduction in CpNpG or
CpN methylation is minimal, no cases of reduced methylation that could be
unambiguously attributed to CpG sites have been observed.

Reduced methylation observed by HPLC

To further assess the extent of methylation reduction caused by the
zmet2a::Mu1 allele, HPLC was used to determine the proportion of methylated
cytosines in the same F5 plants used for restriction enzyme analysis. An 11.6%
decrease in 5-methylcytosine was observed in plants homozygous for zmet2a::Mu1
relative to siblings homozygous for wild-type zmet2a (FIG. 12). Heterozygotes were
intermediate in 5-methylcytosine content. Differences between the genotypic classes
are statistically significant at α < 0.0001. Since most methylation is found at CpG
sites (Gruenbaum et al., Nature, 292:860-862 (1981)), a 12% decrease in the total
5-methylcytosine content likely accounts for a substantial reduction in methylation at
CpNpG sites if the reductions are confined to these sequences.
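
The arithmetic behind this inference can be made explicit. The sketch below is illustrative only: the function name is hypothetical, and the share of total 5-methylcytosine residing at CpNpG sites is not stated in the text, so the value used in the example (0.25) is an assumed placeholder, not a measured figure.

```python
def implied_class_reduction(total_decrease_pct, class_share_of_5mc):
    """Relative reduction within one sequence class, assuming the entire
    loss of 5-methylcytosine is confined to that class.

    total_decrease_pct: decrease in total 5-methylcytosine, in percent.
    class_share_of_5mc: fraction (0-1] of all 5-methylcytosine found in
        the class before the reduction (hypothetical input).
    """
    if not 0 < class_share_of_5mc <= 1:
        raise ValueError("class_share_of_5mc must be in (0, 1]")
    return total_decrease_pct / class_share_of_5mc


# Assumed example: if CpNpG sites carried a quarter of all 5-methylcytosine,
# an 11.6% total decrease would correspond to a much larger loss at those sites.
example = implied_class_reduction(11.6, 0.25)
```

The smaller the share of methylation carried by CpNpG sites, the larger the relative reduction at those sites that a given total decrease implies.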

Several generations of inbreeding do not reduce methylation levels beyond
that which is observed in the F2 homozygous mutant (FIG. 13). In addition, it was
also observed that plants restored to a normal zmet2a genotype from zmet2a::Mu1
heterozygotes appeared to have near normal levels of methylation.

Methylation is restored after segregation away from zmet2a::Mu1

To test remethylation, a nonmutant line, Mo17, was crossed to a homozygous
mutant line; the resulting F1 was then backcrossed to the nonmutant Mo17 parent line.
Restriction enzyme analysis of backcross progeny shows that all individuals without the
Mu1 insertion have remethylated to levels similar to the backcross parent (see FIG. 14).
The increased levels of methylation observed in normal BC1 progeny appear to be
higher than that expected from the segregation of normal Mo17-derived chromosome
segments and low methylation mutant segments, which would result in a pattern
intermediate between the F1 and the nonmutant parent. These results indicate either
that zmet2a has in vivo de novo activity and is responsible for establishing CpNpG
methylation patterns, or that a separate de novo methyltransferase functions only early
in development and that zmet2a is responsible for maintaining these patterns. These
results on remethylation are in contrast to those of the reduced methylation patterns of
Arabidopsis mutants. Backcross progeny lacking an antisense MET1 transgene or
the ddm1 mutation, derived from mutant plants outcrossed to normal plants, showed
very slow remethylation and required several generations to restore methylation to
normal levels (Ronemus et al., Science, 273:654-657 (1996); Vongs et al., Science,
260:1926-1928 (1993); Kakutani et al., Genetics, 151:831-838 (1999)). Similar
results were observed in selfed progeny from hemizygous antisense MET1 plants that
did not inherit the transgene (Finnegan et al., Proc. Natl. Acad. Sci. USA 93:8449-
8454 (1996)); however, a centromeric region and some single copy sites did
remethylate in the first generation (Finnegan et al., Annu. Rev. Plant Physiol. Plant
Mol. Biol., 49:223-247 (1998)).

Other DNA methyltransferases that lack the large N-terminal domain have
been presumed to be de novo enzymes; however, evidence remains insufficient. In
vitro expression of Dnmt3a and Dnmt3b (Okano et al., Nature Genetics, 19:219-220
(1998)) did not show a specific preference for hemimethylated DNA or
nonmethylated DNA, and in vivo expression in Drosophila (Lyko et al., Nature
Genet., 23:363-366 (1999)) further confirms de novo activity, whereas Dnmt2 (Okano
et al., Nucleic Acids Res., 26:2536-2540 (1998)) was shown not to affect de novo or
maintenance methylation in mice. Masc1, in Ascobolus, is purported to have de novo
activity through its effect on methylation induced premeiotically (MIP) (Malagnac et
al., Cell, 91:281-290 (1997)). Another Ascobolus methyltransferase, Masc2, was
found to be dispensable for maintenance and de novo methylation in vivo (Malagnac
et al., Mol. Micro., 3:331-338 (1999)).

A chromodomain is present in zmet2a 

A distinguishing feature of zmet2a, like CMT1, is the presence of the
chromodomain. Chromodomains have been demonstrated to target proteins to
heterochromatic regions and may also be a site of protein-protein interactions
(reviewed by Cavalli and Paro, Curr. Op. Cell Biol., 10:354-360 (1998)). The
presence of the chromodomain in zmet2a and CMT1 potentially suggests targeting of
the methyltransferase to chromatin complexes or a role of the methyltransferase in
chromatin formation and stability. Furthermore, the observation that zmet2a affects
CpXpG methylation may also implicate protein targeting through the chromodomain
and targeting of methylation patterns. Stable transcriptionally active or silent states
may be determined by the formation of chromatin complexes. The mechanisms
involved in the formation of silencing complexes remain unknown. However, there is
evidence of the involvement of methylation in transcriptionally silenced states which
involve methylation binding proteins, transcriptional repressor complexes, and histone
deacetylases (Nan et al., Nature, 393:386-389 (1998); Wade et al., Nature Gen., 23:62-
66 (1999); Ng et al., Nature Gen., 23:58-61 (1999)).

zmet2a is expressed throughout plant development. Expression is higher in
the rapidly dividing tissues of seedling, immature ear and embryos (FIG. 15),
consistent with the role of methyltransferases in methylating newly synthesized DNA.
Low expression of zmet2a in terminal tissue (leaves) could serve a protective function
against invading DNA if this enzyme does have a de novo function.

Example 2 - Cloning and Sequencing of the maize retrotransposon SPRITE-1

This example describes the cloning and sequencing of a maize retrotransposon
that is inserted into an intron of zmet2a and is referred to herein as "SPRITE-1".

a. Introduction 

Within the genomes of most organisms are DNA elements that can be
considered parasitic. These elements confer no phenotype of their own and function
only for their propagation and insertion elsewhere in the genome. There are two
major classes of these elements based on the mechanisms of propagation. One class
propagates using DNA-mediated mechanisms where the element does not code for
any polymerase and entirely depends on the replication machinery of the host. This
class includes the Ac, Spm, and Mu transposable element systems. The other major
class is known as retrotransposons, retrotransposable elements or retroelements
(reviewed in Grandbastien, Trends in Genetics 8:103-108 (1992); Eickbush, Origin
and Evolutionary Relationships of Retroelements. In The Evolutionary Biology of
Viruses (Morse, S.S., ed.) (1994); Wessler et al., Current Biology, 5:814-821 (1995);
Bennetzen, Genome, 37:565-576 (1996)). These elements are not able to excise from
one site and insert into another, as the previously mentioned class is capable of doing,
but replicate by an RNA-mediated process. The retroelements code for a reverse
transcriptase, which is a DNA polymerase that uses RNA as a template.

There are several types of retroelements. The main types are retroviruses,
long-terminal-repeat (hereinafter "LTR") retroelements, and non-LTR retroelements.
Retroviruses are infectious and have not been found in plants, although one plant
LTR-retroelement, SIRE-1 from soybean, has coding sequences similar to that of a
retroviral envelope protein (Laten et al., Proc. Natl. Acad. Sci., 95:6897-6902 (1998)).
The non-LTR class is mainly composed of long interspersed nuclear elements
(hereinafter "LINEs") and short interspersed nuclear elements (hereinafter "SINEs").
These elements have been found in plants. Less is known about this class than the
others. They do differ from LTR-retroelements in that they contain a poly-A tail at
their 3' end. The LTR-retroelement class has been more extensively described in
plants than the other classes of retroelements. The LTR-retroelements are usually
categorized as one of two groups based on the similarity with the first elements
described in yeast and Drosophila. One group shares similarity with the Ty3
elements from yeast and the gypsy element of Drosophila (Marlor et al., Mol. Cell.
Biol., 22:829-846 (1986); Clark et al., J. Biol. Chem., 263:1413-23 (1988)). The other
group has similarity with the Ty1 elements of yeast and the copia element of
Drosophila. The element identified in this study is of the Ty1/copia class (Clare and
Farabaugh, Proc. Natl. Acad. Sci. USA, 82:2829-2833 (1985); Mount and Rubin, Mol.
Cell. Biol. 5:1630-1638 (1985)).

The general structure of an LTR-retroelement is depicted in FIG. 16A. These
elements are similar in their structure and replication to retroviruses (reviewed in
Witcomb and Hughes, Ann. Rev. Cell Biol., 8:275-306 (1992); Eickbush, Origin and
Evolutionary Relationships of Retroelements. In The Evolutionary Biology of Viruses
(Morse, S.S., ed.). New York: Raven Press, pp 121-157 (1994); Bennetzen, Trends in
Microbiology, 9:347-353 (1996)). These elements have direct repeats at the termini
as opposed to the DNA based elements that have inverted terminal repeats.
Downstream from the 5' LTR is a primer binding site for a host tRNA that primes the
first DNA strand synthesis using reverse transcriptase. One or more open reading
frames that code for gag, a protease, an integrase, a reverse transcriptase, and RNase H
are located downstream from the primer binding site. After the coding region is a
polypurine tract followed by the 3' LTR. Ty3/gypsy and Ty1/copia elements differ in
the position of the integrase coding region. Ty3/gypsy elements have the integrase
domain at the end of the coding region, whereas Ty1/copia elements have it positioned
between the proteinase and reverse transcriptase regions. The gag gene encodes
proteins for the nucleocapsid and the highly conserved cysteine-histidine nucleic acid
binding domain (CX2CX4HX4C). The protease processes the polyprotein into its
individual components. The integrase functions to insert a newly replicated element
into the host DNA. The reverse transcriptase synthesizes the first DNA strand from
the transcribed RNA of the element. The RNase H degrades the RNA following first
strand synthesis. Retroelements rely on the RNA polymerase of the host for
transcription and the host DNA polymerase for second strand DNA synthesis to
complete replication.
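
The cysteine-histidine motif CX2CX4HX4C mentioned above translates directly into a pattern search. The following is a minimal sketch and not a tool used in this example: the function name is hypothetical, and the pattern treats every X as any single residue.

```python
import re

# CX2CX4HX4C: Cys, any 2 residues, Cys, any 4 residues, His, any 4 residues, Cys.
# The lookahead lets overlapping candidate motifs be reported as well.
CYS_HIS_BOX = re.compile(r"(?=(C.{2}C.{4}H.{4}C))")


def find_cys_his_boxes(protein):
    """Return (0-based start, matched motif) pairs for a one-letter protein sequence."""
    return [(m.start(), m.group(1)) for m in CYS_HIS_BOX.finditer(protein.upper())]
```

Applied to a translated gag region, such a search gives candidate positions for the zinc-binding domain that can then be checked against an alignment.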

Using PCR based methods, retroelements were found within nearly every
species of the plant kingdom studied (Flavell et al., Nuc. Acids Res. 20:3639-3644
(1992); Voytas et al., Proc. Natl. Acad. Sci. USA 89:7124-7128 (1992)). Despite the
ubiquitous nature of retroelements, there is great heterogeneity among the elements
within and among species (Flavell et al., Nuc. Acids Res. 20:3639-3644 (1992); Wang
et al., Plant Mol. Biol., 33:1051-1058 (1997); Pearce et al., Mol. Gen. Genet.,
250:305-315 (1996)).

Retroelements are found to be distributed over the entire lengths of
chromosomes in Avena sativa (Katsiotis et al., Genome, 39:410-417 (1996)) but have
also been found to be less abundant in heterochromatin, nucleolar organizer regions,
centromeres and telomeres (Pearce et al., Mol. Gen. Genet., 250:305-315 (1996);
Moore et al., Genomics, 10:469-476 (1991); Aledo et al., Theor. Appl. Genet.,
90:1094-1100 (1995); Brandeis et al., Plant Mol. Biol., 33:11-21 (1997)).
Retroelement-like sequences were found in centromeric regions of grass chromosomes
(Miller et al., Genetics, 150:1615-1623 (1998)). Many retroelements were discovered
by their associations with plant genes (Johns et al., EMBO J., 4:1093-1102 (1985);
Grandbastien et al., Nature, 337:376-380 (1989); Camirand et al., Mol. Gen. Genet.,
224:33-39 (1990); White et al., Proc. Natl. Acad. Sci. USA, 91:11792-11796 (1994);
Hu et al., Mol. Gen. Genet., 248:471-480 (1995); Bi and Laten, Plant Mol. Biol.,
30:1315-1319 (1996); Royo et al., Mol. Gen. Genet., 250:180-188 (1996); Kumekawa
et al., Mol. Gen. Genet., 260:593-602 (1999)). Many more retroelements or
retroelement fragments have been identified using PCR with degenerate primers
(Voytas et al., Proc. Natl. Acad. Sci. USA, 89:7124-7128 (1992); Flavell et al., Nuc.
Acids Res., 20:3639-3644 (1992); Flavell et al., Mol. Gen. Genet., 231-233 (1992);
Pearce et al., Mol. Gen. Genet., 250:305-315 (1996); Katsiotis et al., Genome, 39:410-
417 (1996); Wang et al., Plant Mol. Biol., 33:1051-1058 (1997)). Others have been
identified through studies for other purposes (Bhattacharyya et al., Plant Mol. Biol.,
34:255-264 (1997); Vicient and Martinez-Izquierdo, Gene, 184:257-261 (1997);
Manninen and Schulman, Plant Mol. Biol. 22:829-846 (1993)) or by genome
sequencing projects.

The Ty3/gypsy and the Ty1/copia elements can be found in large numbers and
may contribute up to 50% of the nuclear DNA of the maize genome (SanMiguel et al.,
Science, 274:765-768 (1996)). A 280 Kb region of the maize genome containing the
Adh1-F and u22 genes was composed of retroelements, from 10 different families,
inserted within each other. The copy number of Ty1/copia elements varies
considerably. For example, the Ta1 elements of Arabidopsis (Voytas et al., Genetics,
126:713-721 (1990)) and the Tst1 element of Solanum tuberosum (Camirand et al.,
Mol. Gen. Genet., 224:33-39 (1990)) have one to only a few copies, whereas the maize
element PREM-2 (Bennetzen, Trends in Microbiology, 9:347-353 (1996)) and the
BARE-1 element of Hordeum vulgare (Manninen and Schulman, Plant Mol. Biol.
22:829-846 (1993)) may be present at 30,000 or more copies.

The differences in copy number imply differences in expression of
retroelements. Retroelements are not expressed at high levels, as only a few examples
of activity have been observed. The Bs1 and Zeon-1 elements of maize (Johns et al.,
EMBO J., 4:1093-1102 (1985); Hu et al., Mol. Gen. Genet., 248:471-480 (1995)), the
Tos elements of rice (Hirochika et al., Proc. Natl. Acad. Sci. USA 93:7783-7788
(1996)), the Tnt1 and Tto1 elements of tobacco (Grandbastien et al., Nature, 337:376-
380 (1989); Hirochika, EMBO J., 12:2521-2528 (1993)) and the Tnp2 element of
Nicotiana plumbaginifolia have shown evidence of activity. Retroelement expression
is higher in plant tissues under stressful conditions. The Tto1 and Tto2 elements of
tobacco and the Tos17 element of rice were shown to be activated in tissue culture
(Hirochika, EMBO J., 12:2521-2528 (1993); Hirochika et al., Proc. Natl. Acad. Sci.
USA (1996)). The promoters of the BARE-1 element of barley and the Tnt1 element
of tobacco drove expression of reporter genes in protoplasts (Suoniemi et al., Plant
Mol. Biol., 31:295-306 (1996); Pouteau et al., EMBO J., 10:1911-1918 (1991)).

Biotic stresses such as viral, fungal and bacterial infection and abiotic stress
such as wounding have also been shown to initiate the expression of Tnt1 and Tto1
retroelements (Pouteau et al., Plant J., 5:535-542 (1994); Moreau-Mhiri et al., Plant
J., 9:409-419 (1996); Vernhettes et al., Plant Mol. Biol., 35:673-679 (1997); Mhiri et
al., Plant Mol. Biol. 33:257-266 (1997); Grandbastien et al., Genetica, 100:241-252
(1997); Takeda et al., Plant Mol. Biol., 36:365-376 (1998)). The Bs1 element of
maize may have been mobilized prior to insertion in the Adh1 gene by infection with
the barley stripe mosaic virus (Johns et al., EMBO J., 1093-1102 (1985)). Only the
expression of BARE-1 has been observed in normal unstressed barley leaves
(Suoniemi et al., Plant Mol. Biol., 31:295-306 (1997)).

Under normal conditions, retroelements are transcriptionally inactive and are
thus transpositionally inactive. Mechanisms within the host must exist to regulate the
activity of the retroelements to prevent potentially deleterious mutations that could
occur if retroelement transposition were unchecked. Most retroelements are highly
methylated (Bennetzen et al., Genome, 37:565-576 (1994)), are possibly located in
heterochromatic regions, and may not be accessible to transcriptional machinery.
Though retroelements are silenced in most cases and active in stressful situations, it
has been suggested that retroelement transposition may create mutations that may be
of selective advantage and provide a means for adaptation (McClintock, Science,
226:792-801 (1984)).

b. Cloning and Sequencing of SPRITE-1.

A zmet2a genomic clone was isolated from a lambda library (Stratagene)
constructed from Mo17 genomic DNA. The sequence was obtained from subclones
or from PCR products by primer walking. Fragments were sequenced using Big Dye
terminator cycle sequencing on an ABI sequencer (Perkin-Elmer Applied Biosystems)
at the University of Wisconsin Biotechnology Center Sequencing Facility, Madison,
Wisconsin.

Expression analysis was conducted on cDNAs prepared using the Marathon
cDNA Amplification Kit (Clontech) according to the manufacturer's protocols from
mRNA isolated from a Mo17 10 day old seedling, Mo17 immature tassel, B73
immature ear, Black Mexican Sweet (BMS) callus, Mo17 embryo 24 days after
pollination, W22 pollen, young roots, and immature leaf tissue from zmet2a normal
and mutant plants. Total RNA was extracted using Trizol (Gibco/BRL) according to
the manufacturer's protocol. Seedling mRNA was isolated using oligo dT cellulose
columns (Pharmacia); all other mRNA was isolated using the PolyAttract system
(Promega).

c. DNA extraction and Southern analysis for genotyping and methylation 
analysis. 

DNA was extracted from immature leaf blades as described in Saghai Maroof
et al. (Proc. Natl. Acad. Sci. USA 81:8014-8018 (1984)). The copy number of
SPRITE-1 was determined by digesting DNA (10 µg) with EcoRI, which does not cut
within the element. The digested DNA was electrophoresed through a 0.8% agarose
0.5X TBE gel. Gels were treated with 0.25 N HCl for 15 minutes, denatured in 0.2 N
NaOH and 0.6 M NaCl for 30 minutes, then neutralized in 0.5 M Tris 1.5 M NaCl for
30 minutes. DNA was transferred to Immobilon nylon membrane (Millipore) with
5X SSC. Blots were dried at 80 °C for 1.5 hours. Pre-hybridization was carried out
in 5X SSC, 50 mM Tris pH 8.0, 0.2% SDS, 10 mM EDTA, 2.5X Denhardt's solution,
and 0.1 mg/ml single stranded sheared herring DNA overnight (8-16 hours) at 65 °C.
Hybridization conditions were similar to pre-hybridization except for the addition of
5% dextran sulfate to the hybridization solution. The blot was probed with a PCR
fragment (25-50 ng) amplified from the 5' end of the element. Probes were P-32 (50
µCi) labeled using random priming. Following overnight hybridization at 65 °C,
blots were washed 2X (0.15X SSC, 0.1% SDS) for 30-45 minutes at 65 °C.
Hybridized blots were then exposed to Kodak BioMax film. Southern analysis with
methylation sensitive restriction enzymes was conducted on B73 and Mo17 using the
same protocols as for genotyping except that 5 µg of DNA was digested. Enzymes
included in the study were the differentially methylation sensitive isoschizomers
HpaII/MspI and EcoRII/BstNI as well as other methylation sensitive enzymes: HhaI
and PstI. Blots were hybridized with probes representing different portions of the
element.

d. HPLC analysis. 

HPLC was conducted according to a modified protocol of Gehrke et al. (J.
Chromato., 301:199-219 (1984)). B73 x Mo17 recombinant inbred lines carrying a
SPRITE-1 insertion were identified using PCR with the zmet2a primers 15F and 8R,
and the SPRITE-1 primer 18R. Preparations for each of four plants with and without
SPRITE-1 were analyzed. Twenty-five micrograms of DNA was diluted with water
to a volume of 50 µl, denatured at 96 °C for 5 minutes and immediately placed on ice.
One hundred microliters of 30 mM ammonium acetate (pH 5.3), 5 µl of 20 mM zinc
sulfate and 10 µl of Nuclease P1 (1 mg/ml in 30 mM ammonium acetate (pH 5.3)) were
added and incubated at 37 °C for 2 hours. This reaction cleaves 5' mononucleotides
from single stranded DNA. The pH was adjusted with 20 µl of Tris (pH 8.5), and
approximately 15 units of Calf Intestinal Alkaline Phosphatase was added and
incubated at 37 °C for an additional 2 hours, which converts the nucleotides to
nucleosides. Samples were frozen at -20 °C until HPLC analysis.

HPLC analysis was conducted at the University of Wisconsin Biotechnology
Center, Madison, Wisconsin. A volume of 40 µl was injected into a Brownlee Lab
Spheri-5 RP-8 column. Nucleosides were separated with a flow rate of 0.75 ml/min
using a gradient program consisting of 30 minutes in buffer A (0.05 M potassium
phosphate pH 4.0, 2.5% methanol), 19 minutes in buffer B (0.05 M potassium
phosphate pH 4.0, 20% methanol). The column was flushed with 70% methanol for
13 minutes and then re-equilibrated with buffer A for 23 minutes before the injection
of the next sample. All samples were analyzed on a Beckman System Gold
chromatograph and nucleosides detected at A260 nm and A280 nm. Nucleoside and
nucleotide standards (Sigma) were used to determine nucleoside peak positions and to
create a standard curve to determine nucleoside concentration. The ratio of 5-
methylcytosine to total cytosine was calculated and statistical analysis conducted
using SAS.
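
The final calculation can be sketched as follows. This is an assumption-laden illustration rather than the procedure actually used: the function names are hypothetical, the standard curve is assumed to be linear, and "total cytosine" is taken to be the sum of methylated and unmethylated cytosine nucleosides.

```python
def concentration_from_area(peak_area, slope, intercept=0.0):
    """Invert an assumed linear standard curve: area = slope * concentration + intercept."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return (peak_area - intercept) / slope


def percent_5mc(conc_5mdc, conc_dc):
    """Percent 5-methylcytosine relative to total cytosine.

    conc_5mdc: concentration of the methylated cytosine nucleoside.
    conc_dc:   concentration of the unmethylated cytosine nucleoside.
    Total cytosine is assumed to be the sum of the two.
    """
    total = conc_5mdc + conc_dc
    if total <= 0:
        raise ValueError("total cytosine must be positive")
    return 100.0 * conc_5mdc / total
```

Each sample then contributes one percentage value, and those values are what a statistical comparison between genotypic classes would operate on.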

e. Expression analysis.

The expression of SPRITE-1 was determined by hybridizing a SPRITE-1 probe to
a Southern blot of cDNAs prepared from different tissues and tissues at different
stages of development. Tissues included in this study are embryos 24 days after
pollination, 10 day seedlings, immature ear, immature tassel, immature leaf from
mutant and nonmutant plants, roots, BMS callus, and mature pollen. Total RNA was
extracted using Trizol (Gibco/BRL) according to the manufacturer's protocol. The
PolyAttract System (Promega) was used to isolate mRNAs from all tissues except 10
day seedlings, from which mRNA was isolated using oligo dT cellulose columns
(Pharmacia). cDNA was synthesized from the isolated RNAs using the Marathon
cDNA Amplification Kit (Clontech).

f. Results

SPRITE-1 is similar to retrotransposons of the Ty1/copia group.

In the process of sequencing the maize methyltransferase gene zmet2a, a
retroelement inserted within an intron of this gene was discovered and named
SPRITE-1. This element is positioned in opposite transcriptional orientation relative
to zmet2a. The insertion spans 5220 bp and possesses all the components of a
retroelement. Sequence data indicate that SPRITE-1 is a Long-Terminal-Repeat
(hereinafter "LTR") retroelement belonging to the Ty1/copia class of retroelements.
FIG. 16a depicts the general structural components of SPRITE-1. FIG. 16b shows the
sequence of the terminal structural components. SPRITE-1 has perfect 109 bp
direct terminal repeats, which include a 3 bp inverted repeat that flanks the internal
element sequence. These repeats have the TG...CA pattern found in most plant
retroelements and are also shorter than LTRs of most retroelements. LTRs range in
size from 115 bp to 4560 bp from information compiled by Bennetzen (Trends in
Microbiology, 9:347-353 (1996)). A 5 bp host site duplication flanks the repeats
externally. Downstream and adjoining the 5' LTR is a primer binding site (PBS) of
16 bp that has sequence complementary to the wheat germ cytoplasmic initiator
methionine tRNA (Ghosh et al., Nuc. Acids Res., 10:3241-3247 (1982)). Upstream
and adjoining the 3' LTR is a polypurine tract of 9 bp. Between the putative
transcription start site and the predicted translation start site is a 550 bp untranslated
region. SPRITE-1 contains a single open reading frame coding for 1485 amino acids
ending with the stop codon at the 5' end of the polypurine tract.
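
The terminal features described above (identical direct repeats, TG...CA termini, a short terminal inverted repeat, and a flanking host site duplication) can be checked mechanically once an insertion and its flanking bases are in hand. The sketch below is illustrative only; the function names are hypothetical, and the input is assumed to be laid out as duplication + 5' LTR + internal sequence + 3' LTR + duplication.

```python
def reverse_complement(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_ltr_element(seq, ltr_len, tsd_len, inverted_len=3):
    """Check the terminal hallmarks of an LTR retroelement insertion.

    seq is assumed to be: host duplication + 5' LTR + internal + 3' LTR + host duplication.
    """
    left_tsd = seq[:tsd_len]
    right_tsd = seq[len(seq) - tsd_len:]
    element = seq[tsd_len:len(seq) - tsd_len]
    ltr5 = element[:ltr_len]
    ltr3 = element[len(element) - ltr_len:]
    return {
        "direct_repeats_identical": ltr5 == ltr3,
        "tg_ca_termini": ltr5.startswith("TG") and ltr5.endswith("CA"),
        "terminal_inverted_repeat":
            reverse_complement(ltr5[:inverted_len]) == ltr5[-inverted_len:],
        "host_site_duplication": left_tsd == right_tsd,
    }
```

For the element described here, the call would use `ltr_len=109` and `tsd_len=5`; a short made-up sequence is enough to exercise the logic.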

Database searches for similar coding sequences using BLAST
(http://www.ncbi.nlm.nih.gov/gov/BLAST/) show that SPRITE-1 belongs to a
different family of retroelements than any other previously described. The most
closely related elements based on overall amino acid similarity include an
Arabidopsis retroelement (AC006528), Retrofit from Oryza longistaminata (U72725),
and Hopscotch from Zea mays (U12626), all having ~35% identity and ~50%
conservation in amino acid sequence with SPRITE-1. It also shares 29% identity and
45% conservation with the copia element from Drosophila. No elements were found
to have nucleotide similarity with the LTR of SPRITE-1, further indicating that this is
a member of a unique family of Ty1/copia type elements.
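
Identity percentages of this kind depend on how aligned positions are counted. The sketch below shows one simple convention (identical residues divided by ungapped aligned columns); it is an illustration with a hypothetical function name and does not reproduce the calculation BLAST itself reports.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (a convention chosen for this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A "conservation" figure would additionally need a definition of which substitutions count as conservative, which the text does not specify.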

SPRITE-1 has the component retrovirus-like amino acid motifs that code for
the proteins necessary for transposition. These motifs are the gag-related protein that
contains a Cys-His box, also known as the CCHC zinc-binding domain, a protease, an
integrase, reverse transcriptase and RNase H. These motifs are ordered as they are in
Ty1 and copia. FIG. 17 shows amino acid alignments of these conserved regions from
the similar retroelements previously mentioned. These motifs were similarly
positioned relative to each other in these retroelements except the CCHC zinc binding
domain, which was more variable in position relative to the protease motif. This motif
was aligned by hand, whereas the alignments of the other motifs were constructed by
CLUSTAL W and processed using BOXSHADE. Alignments indicate that SPRITE-1
does possess the component protein coding regions necessary for replication and
transposition. The coding regions of many retroelements have shown mutations that
create frameshifts or introduce stop codons, thus preventing translation of functional
proteins and preventing transposition. The coding region of SPRITE-1 is intact and
therefore has the potential to transpose.

The number of copies of SPRITE-1 is relatively low but variable. 

A survey of inbred lines developed from several different populations and
other genetic stocks revealed differences in SPRITE-1 copy number. DNA was
digested with EcoRI and Southern blots hybridized with a probe representing the 5'
untranslated region of SPRITE-1. This element does not have any EcoRI restriction
sites. SPRITE-1 is found at a low copy number in most maize lines. Copy number
varies from 3, as in B73 and Mo17, to 5, as in B14 and B79 (FIG. 18). The insertion of
SPRITE-1 into zmet2a is only found in Mo17 and not in any other maize inbred line
except A682, a line derived from Mo17 (FIG. 19). C.I. 187-2, a Mo17 parental line,
does not contain SPRITE-1. This indicates that SPRITE-1 has been active recently,
i.e. after the origin of the maize populations used for inbred development.

Expression of SPRITE-1

Expression was investigated by hybridizing a Southern blot of cDNAs,
synthesized from mRNA from different maize tissues, with a SPRITE-1 probe (FIG.
20). Expression of SPRITE-1 was highest in leaf tissue. Expression was highest in
leaf tissue from plants with a MUTATOR insertion in zmet2a and decreased CpNpG
methylation. A low level of expression was observed in most tissues, but this may be
due to transcription of other genes containing SPRITE-1 in a sense orientation.

SPRITE-1 does not affect zmet2a transcript processing.

During the sequencing of zmet2a cDNA, no fragments or subclones possessed
SPRITE-1 sequence, indicating that it is efficiently spliced from the transcript.
Aberrant splicing has been observed in genes containing retroelements (Pouteau et al.,
Mol. Gen. Genet., 228:233-239 (1991); Varagona et al., Plant Cell, 4:811-820 (1992);
Marillonnet and Wessler, Plant Cell, 9:967-978 (1997); Kapitonov and Jurka, J. Mol.
Evol., 48:248-251 (1999)). Expression of three alleles of the waxy gene of maize was
low due to retroelement insertions within introns (Varagona et al., Plant Cell, 4:811-
820 (1992)). Varagona et al. (Plant Cell, 4:811-820 (1992)) found that although the
element was spliced out of the waxy transcript, long-range splice site recognition was
disrupted, as exons upstream and downstream of the insertion site were found to be
excluded in some transcripts. Further analysis of the wxG allele showed tissue
specific differences in RNA processing, with more correctly spliced transcripts in
pollen than in the endosperm (Marillonnet and Wessler, Plant Cell, 9:967-978
(1997)).

Alternatively spliced transcripts were searched for by PCR amplification of
fragments spanning several exons both upstream and downstream of the SPRITE-1
insertion site. Fragments were amplified from Mo17 seedling and immature embryo
cDNA and compared to fragments amplified from B73 immature ear cDNA (FIG.
21). Amplification products were separated on an agarose gel and Southern blotted.
The Southern blot was hybridized to a near full length zmet2a cDNA. No differences
were observed between the B73 and Mo17 products, indicating that only correctly
spliced fragments were detected. The blot was stripped and probed with retroelement
sequences. No transcripts were amplified that contained any SPRITE-1 sequence. In
the tissues examined in this example, no aberrant transcripts were detected. Aberrant
splicing products may be at such a low concentration that they are not detectable.

SPRITE-1 does not affect zmet2a expression and function.

Since SPRITE-1 is inserted into an intron of zmet2a, the effect of this insertion
on zmet2a activity was investigated. HPLC data show no methylation differences
among the recombinant inbred lines with or without a SPRITE-1 insertion in zmet2a.
Lines with a SPRITE-1 insertion had 18.21% ± 1.78 5-methylcytosine, whereas lines
without the insertion had 18.20% ± 0.24. It is probable that most transcripts are
processed correctly, since no changes in methylation are observed in plants with a
SPRITE-1 insertion.

Regions of SPRITE-1 are hypermethylated 

Portions of SPRITE-1 were examined to determine the status of cytosine
methylation. Using methylation sensitive restriction enzymes, sites within 970 bp of
the untranslated region (hereinafter "UTR") immediately downstream from the
transcription start site were analyzed. FIG. 22 shows methylation sensitive restriction
digestion patterns for Mo17 and B73. The isoschizomers HpaII and MspI recognize
CCGG sequences and are differentially sensitive to methylation. SPRITE-1 has a
single MspI/HpaII site. Using the SPRITE-1 sequence from Mo17, the zmet2a
insertion of SPRITE-1 would generate fragments of 5853 bp and 4625 bp. Other
SPRITE-1 insertions would generate fragments of variable lengths. Southern blots
show only very large fragments (>20 Kb) for both HpaII and MspI. MspI does show a
smaller fragment than HpaII, but it is still much larger than the size expected for the
zmet2a insertion. This indicates that this site is methylated in most SPRITE-1 copies.
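
The reasoning above, in which an uncut site yields a fragment larger than predicted, can be illustrated with a small in silico digest. This is a sketch under simplifying assumptions: the sequence is linear, digestion is complete, and methylation is modeled simply as a set of site positions that the enzyme does not cut. The function name `digest` and the toy sequence are hypothetical and are not the SPRITE-1 sequence. A cut offset of 1 reflects cleavage between the two cytosines of CCGG.

```python
import re


def digest(seq, site, cut_offset, protected=frozenset()):
    """Fragment lengths from a complete digest of a linear sequence.

    site        recognition sequence written as plain bases, e.g. "CCGG"
    cut_offset  cut position counted from the first base of the site
    protected   start positions of sites that are not cut (e.g. methylated)
    """
    # The lookahead finds every occurrence, including overlapping ones.
    starts = [m.start() for m in re.finditer(f"(?={site})", seq)]
    cuts = [s + cut_offset for s in starts if s not in protected]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Toy sequence with CCGG sites starting at positions 4 and 14 (hypothetical).
toy = "AAAACCGGTTTTTTCCGGAA"

print(digest(toy, "CCGG", 1))                     # [5, 10, 5]: both sites cut
print(digest(toy, "CCGG", 1, protected={4}))      # [15, 5]: first site protected
print(digest(toy, "CCGG", 1, protected={4, 14}))  # [20]: nothing cut
```

In this model, protecting a site merges the two fragments that flank it, which is why a blot showing only fragments far larger than the predicted sizes points to methylation of the site.
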

Another pair of isoschizomers, BstNI and EcoRII, recognize the sequence
CC(A/T)GG. BstNI is not sensitive to methylation and EcoRII will not cut when the
internal cytosine is methylated. BstNI should generate SPRITE-1-specific fragments
of 6, 54, 135, 252, and 784 bp with the UTR probe. All EcoRII fragments were
greater than 20 Kb, indicating complete methylation of these sites. HhaI, which
recognizes GCGC sites, should generate SPRITE-1-specific fragments of 2884 and
257 bp and a zmet2a insertion fragment of 2965 bp. No fragments this small were
observed, indicating methylation at these sites. The PstI site recognized with this
probe was also methylated.
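
Degenerate recognition sequences such as CC(A/T)GG can be located in a sequence by translating them into a pattern, and the methylation inference then reduces to comparing observed fragment sizes with the predicted ones. The sketch below is illustrative only: the helper names, the toy sequence, and the single observed value of 20000 bp (standing in for "greater than 20 Kb") are assumptions. The predicted sizes are the BstNI fragment sizes quoted above.

```python
import re

# IUPAC codes needed for these sites; W stands for A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}


def site_positions(seq, site):
    """0-based start positions of every match to a (possibly degenerate) site."""
    pattern = "".join(IUPAC[base] for base in site)
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]


def all_larger_than_predicted(observed_bp, predicted_bp):
    """True when every observed fragment exceeds the largest predicted one."""
    return min(observed_bp) > max(predicted_bp)


# Toy sequence (hypothetical), containing CCAGG, GCGC and CCTGG.
toy = "GGCCAGGTTGCGCAACCTGGAT"
print(site_positions(toy, "CCWGG"))  # [2, 15]
print(site_positions(toy, "GCGC"))   # [9]

# Predicted BstNI fragment sizes from the text; observed value is a stand-in.
predicted = [6, 54, 135, 252, 784]
observed = [20000]
print(all_larger_than_predicted(observed, predicted))  # True
```
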

EXAMPLE 2 - Cloning and Sequencing of zmet2b 

A lambda library (Stratagene) constructed from Mo17 maize genomic DNA
was screened with the zmet2a methyltransferase nucleic acid sequences shown in
FIG. 1. This screening resulted in the recovery of seven (7) independent clones. Four
of these clones corresponded exactly to the zmet2a nucleic acid sequence. Another type,
represented by only one clone, had limited homology in non-significant regions. Two
other clones were very similar, but not identical, to the zmet2a methyltransferase
nucleic acid sequence. These clones defined a second gene, referred to as "zmet2b".
Primer walking resulted in a partial genomic sequence of zmet2b. Primers specific to
zmet2b were designed and used to amplify zmet2b cDNA (using the Marathon cDNA
Amplification Kit from Clontech according to the manufacturer's protocols). The
RACE products were isolated and cloned into pGEM-T Easy (Promega). Sequencing
of the RACE products generated a partial cDNA sequence for the 3' end of the gene
(see FIG. 23). A partial amino acid sequence encoded by this cDNA sequence is
shown in FIG. 24. A comparison of a portion of the amino acid sequences for zmet2a
and zmet2b is shown in FIG. 25.

All references cited herein are hereby incorporated by reference. 

The present invention is illustrated by way of the foregoing description and
examples. The foregoing description is intended as a non-limiting illustration, since
many variations will become apparent to those skilled in the art in view thereof. It is
intended that all such variations within the scope and spirit of the appended claims be
embraced thereby.

Changes can be made to the composition, operation and arrangement of the
method of the present invention described herein without departing from the concept
and scope of the invention as defined in the following claims.